Fixes neo4j-contrib#3937: Fully virtual graphs (neo4j-contrib#4043)
* Fixes neo4j-contrib#3937: Fully virtual graphs

* removed unused imports and updated extended.txt
vga91 authored Apr 17, 2024
1 parent b0fe5ef commit 70dc75b
Showing 13 changed files with 642 additions and 0 deletions.
2 changes: 2 additions & 0 deletions docs/asciidoc/modules/ROOT/nav.adoc
@@ -52,6 +52,8 @@ include::partial$generated-documentation/nav.adoc[]
** xref::cypher-execution/cypher-based-procedures-functions.adoc[]
** xref::cypher-execution/parallel.adoc[]
* xref:virtual-nodes-and-relationships/index.adoc[]
* xref:virtual-resource/index.adoc[]
* xref:nlp/index.adoc[]
@@ -0,0 +1,63 @@
= apoc.graph.filterProperties
:description: This section contains reference documentation for the apoc.graph.filterProperties function.

label:function[] label:apoc-extended[]

[.emphasis]
----
apoc.graph.filterProperties(anyEntityObject, nodePropertiesToRemove, relPropertiesToRemove)

Aggregation function which returns an object {nodes: [virtual nodes], relationships: [virtual relationships]} without the properties defined in nodePropertiesToRemove and relPropertiesToRemove
----

== Signature

[source]
----
apoc.graph.filterProperties(value :: ANY?, nodePropertiesToRemove = {} :: MAP?, relPropertiesToRemove = {} :: MAP?) :: ANY?
----

The `nodePropertiesToRemove` and `relPropertiesToRemove` parameters are maps
whose keys are labels (or relationship types) and whose values are the lists of properties to remove from the virtual entities.
In both maps the key can also be `_all`, which applies the filter to every label or relationship type.
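
The matching rule described above can be sketched in plain Java, without any Neo4j dependency. This is only an illustration: `PropertyFilterSketch` and `keptKeys` are hypothetical names, and labels and property keys are modelled as plain strings rather than real nodes.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Set;

public class PropertyFilterSketch {
    // Special key that applies a filter regardless of label or relationship type.
    public static final String ALL_FILTER = "_all";

    /** Returns the property keys that survive the filter for an entity with the given labels. */
    public static List<String> keptKeys(Set<String> labels,
                                        List<String> propertyKeys,
                                        Map<String, List<String>> propertiesToRemove) {
        List<String> kept = new ArrayList<>(propertyKeys);
        propertiesToRemove.forEach((key, toRemove) -> {
            // An entry applies when its key is "_all" or matches one of the entity's labels.
            if (key.equals(ALL_FILTER) || labels.contains(key)) {
                kept.removeAll(toRemove);
            }
        });
        return kept;
    }

    public static void main(String[] args) {
        List<String> keys = List.of("name", "plotEmbedding");
        Map<String, List<String>> byLabel = Map.of("Person", List.of("plotEmbedding", "bio"));
        Map<String, List<String>> all = Map.of(ALL_FILTER, List.of("plotEmbedding"));

        System.out.println(keptKeys(Set.of("Person"), keys, byLabel)); // [name]
        System.out.println(keptKeys(Set.of("Movie"), keys, byLabel));  // [name, plotEmbedding]
        System.out.println(keptKeys(Set.of("Movie"), keys, all));      // [name]
    }
}
```

Properties listed in a filter but absent from the entity (such as `bio` here) are simply ignored.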


== Usage Examples


Given the following dataset:
[source,cypher]
----
CREATE (:Person {name: "foo", plotEmbedding: "11"})-[:REL {idRel: 1, posterEmbedding: "33"}]->(:Movie {name: "bar", plotEmbedding: "22"}),
(:Person {name: "baz", plotEmbedding: "33"})-[:REL {idRel: 1, posterEmbedding: "66"}]->(:Movie {name: "ajeje", plotEmbedding: "44"})
----

we can execute:

[source,cypher]
----
MATCH path=(:Person)-[:REL]->(:Movie)
WITH apoc.graph.filterProperties(path, {Movie: ['posterEmbedding'], Person: ['posterEmbedding', 'plotEmbedding', 'plot', 'bio']}) as graph
RETURN graph.nodes AS nodes, graph.relationships AS relationships
----

.Results
[opts="header",cols="2"]
|===
| nodes | relationships
| [(:Person {name: "1"}), (:Movie {name: "bar"}), (:Movie {title: "1",tmdbId: "ajeje"}), (:Person {name: "baz"}), (:Person {name: "uno"}), (:Movie {name: "ajeje"}), (:Movie {title: "1",tmdbId: "due"}), (:Movie {title: "1",tmdbId: "ajeje"}), (:Person {name: "1"}), (:Movie {title: "1",tmdbId: "ajeje"}), (:Person {name: "foo"}), (:Person {name: "1"})] | [[:REL], [:REL {idRel: 1}], [:REL {idRel: 1}], [:REL], [:REL], [:REL]]
|===

or:

[source,cypher]
----
MATCH path=(:Person)-[:REL]->(:Movie)
WITH apoc.graph.filterProperties(path, {_all: ['plotEmbedding', 'posterEmbedding', 'plot', 'bio']}) as graph
RETURN graph.nodes AS nodes, graph.relationships AS relationships
----

with the same result as above.



@@ -0,0 +1,69 @@
= apoc.graph.filterProperties
:description: This section contains reference documentation for the apoc.graph.filterProperties procedure.

label:procedure[] label:apoc-extended[]

[.emphasis]
----
CALL apoc.graph.filterProperties(anyEntityObject, nodePropertiesToRemove, relPropertiesToRemove) YIELD nodes, relationships

Returns a set of virtual nodes and relationships without the properties defined in nodePropertiesToRemove and relPropertiesToRemove
----

== Signature

[source]
----
apoc.graph.filterProperties(value :: ANY?, nodePropertiesToRemove = {} :: MAP? , relPropertiesToRemove = {} :: MAP?) :: ANY?
----

== Output parameters
[.procedures, opts=header]
|===
| Name | Type
|nodes|LIST OF NODE?
|relationships|LIST OF RELATIONSHIP?
|===

The `nodePropertiesToRemove` and `relPropertiesToRemove` parameters are maps
whose keys are labels (or relationship types) and whose values are the lists of properties to remove from the virtual entities.
In both maps the key can also be `_all`, which applies the filter to every label or relationship type.
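
The first argument (`value`, shown as `anyEntityObject` in the description) may be a single entity, a path, or a nested collection of them. The sketch below mirrors that recursive traversal in plain Java. It is an assumption-laden illustration: `Entity` is a hypothetical stand-in for a Neo4j node or relationship, and a `LinkedHashMap` is used only to make the output order deterministic.

```java
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class NestedExtractSketch {
    /** Hypothetical stand-in for a node or relationship; only the element id matters here. */
    public static final class Entity {
        final String elementId;

        public Entity(String elementId) {
            this.elementId = elementId;
        }
    }

    // Entities keyed by element id, so each one is collected at most once.
    private final Map<String, Entity> seen = new LinkedHashMap<>();

    /** Walks lists, maps, iterators and arrays recursively and collects every Entity found. */
    public void extract(Object value) {
        if (value == null) {
            return;
        }
        if (value instanceof Entity) {
            Entity entity = (Entity) value;
            seen.putIfAbsent(entity.elementId, entity);
        } else if (value instanceof Iterable) {
            ((Iterable<?>) value).forEach(this::extract);
        } else if (value instanceof Map) {
            ((Map<?, ?>) value).values().forEach(this::extract);
        } else if (value instanceof Iterator) {
            ((Iterator<?>) value).forEachRemaining(this::extract);
        } else if (value instanceof Object[]) {
            for (Object item : (Object[]) value) {
                extract(item);
            }
        }
        // Any other type (strings, numbers, ...) is silently ignored.
    }

    public List<String> ids() {
        return List.copyOf(seen.keySet());
    }

    public static void main(String[] args) {
        Entity a = new Entity("a");
        Entity b = new Entity("b");
        NestedExtractSketch sketch = new NestedExtractSketch();
        sketch.extract(List.of(a, Map.of("key", b), new Object[] {a}, "ignored"));
        System.out.println(sketch.ids()); // [a, b]
    }
}
```

Because entities are keyed by element id, passing the same node several times, for example through overlapping paths, yields a single virtual copy.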

== Usage examples

Given the following dataset:
[source,cypher]
----
CREATE (:Person {name: "foo", plotEmbedding: "11"})-[:REL {idRel: 1, posterEmbedding: "33"}]->(:Movie {name: "bar", plotEmbedding: "22"}),
(:Person {name: "baz", plotEmbedding: "33"})-[:REL {idRel: 1, posterEmbedding: "66"}]->(:Movie {name: "ajeje", plotEmbedding: "44"})
----

we can execute:

[source,cypher]
----
MATCH path=(:Person)-[:REL]->(:Movie)
WITH collect(path) AS paths
CALL apoc.graph.filterProperties(paths, {Movie: ['posterEmbedding'], Person: ['posterEmbedding', 'plotEmbedding', 'plot', 'bio']})
YIELD nodes, relationships
RETURN nodes, relationships
----

.Results
[opts="header",cols="2"]
|===
| nodes | relationships
| [(:Person {name: "1"}), (:Movie {name: "bar"}), (:Movie {title: "1",tmdbId: "ajeje"}), (:Person {name: "baz"}), (:Person {name: "uno"}), (:Movie {name: "ajeje"}), (:Movie {title: "1",tmdbId: "due"}), (:Movie {title: "1",tmdbId: "ajeje"}), (:Person {name: "1"}), (:Movie {title: "1",tmdbId: "ajeje"}), (:Person {name: "foo"}), (:Person {name: "1"})] | [[:REL], [:REL {idRel: 1}], [:REL {idRel: 1}], [:REL], [:REL], [:REL]]
|===

or:
[source,cypher]
----
MATCH path=(:Person)-[:REL]->(:Movie)
WITH collect(path) AS paths
CALL apoc.graph.filterProperties(paths, {_all: ['plotEmbedding', 'posterEmbedding', 'plot', 'bio']})
YIELD nodes, relationships
RETURN nodes, relationships
----

with the same result as above.
15 changes: 15 additions & 0 deletions docs/asciidoc/modules/ROOT/pages/overview/apoc.graph/index.adoc
@@ -0,0 +1,15 @@
= apoc.graph
:description: This section contains reference documentation for the apoc.graph procedures.

[.procedures, opts=header, cols='5a,1a']
|===
| Qualified Name | Type
|xref::overview/apoc.graph/apoc.graph.filterPropertiesProcedure.adoc[apoc.graph.filterProperties icon:book[]]

CALL apoc.graph.filterProperties(anyEntityObject, nodePropertiesToRemove, relPropertiesToRemove) YIELD nodes, relationships - returns a set of virtual nodes and relationships without the properties defined in nodePropertiesToRemove and relPropertiesToRemove
|label:procedure[]
|xref::overview/apoc.graph/apoc.graph.filterProperties.adoc[apoc.graph.filterProperties icon:book[]]

apoc.graph.filterProperties(anyEntityObject, nodePropertiesToRemove, relPropertiesToRemove) - aggregation function which returns an object {nodes: [virtual nodes], relationships: [virtual relationships]} without the properties defined in nodePropertiesToRemove and relPropertiesToRemove
|label:function[]
|===
@@ -0,0 +1,43 @@
[[virtual-nodes-and-relationships]]
= Virtual Nodes and Relationships




This section includes:

* xref::overview/apoc.graph/apoc.graph.filterPropertiesProcedure.adoc[apoc.graph.filterProperties (procedure)]
* xref::overview/apoc.graph/apoc.graph.filterProperties.adoc[apoc.graph.filterProperties (aggregation function)]


We can filter some properties of nodes and relationships present in a subgraph using the `apoc.graph.filterProperties` procedure,
or the analogous aggregation function.

For example, suppose we want to exclude the embedding properties created with the following query:

[source,cypher]
----
CALL apoc.ml.openai.embedding(["Test"], "<apiKey>", {}) YIELD embedding
WITH embedding
MATCH (start:Start {id: 1}), (end:End {id: 2})
WITH start, end, embedding
CALL db.create.setNodeVectorProperty(start, "embeddingStart", embedding)
CALL db.create.setNodeVectorProperty(end, "embeddingEnd", embedding)
RETURN start, end
----

With `apoc.graph.filterProperties`, we can return virtual entities without those properties.

If we return the unfiltered nodes to Neo4j Browser or Neo4j Bloom, we get the following,
where we can see the long embedding properties:

image::/browserBeforeFilter.png[scaledwidth="100%"]

image::/bloomBeforeFilter.png[scaledwidth="100%"]


If we filter out the embedding properties instead, the result is easier to read:

image::/browserAfterFilter.png[scaledwidth="100%"]

image::/bloomAfterFilter.png[scaledwidth="100%"]
@@ -71,6 +71,9 @@ This file is generated by DocsTest, so don't change it!
** xref::overview/apoc.get/index.adoc[]
*** xref::overview/apoc.get/apoc.get.nodes.adoc[]
*** xref::overview/apoc.get/apoc.get.rels.adoc[]
** xref::overview/apoc.graph/index.adoc[]
*** xref::overview/apoc.graph/apoc.graph.filterProperties.adoc[]
*** xref::overview/apoc.graph/apoc.graph.filterPropertiesProcedure.adoc[]
** xref::overview/apoc.import/index.adoc[]
*** xref::overview/apoc.import/apoc.import.arrow.adoc[]
** xref::overview/apoc.load/index.adoc[]
174 changes: 174 additions & 0 deletions extended/src/main/java/apoc/graph/GraphsExtended.java
@@ -0,0 +1,174 @@
package apoc.graph;

import apoc.Extended;
import apoc.result.GraphResult;
import apoc.result.VirtualNode;
import apoc.result.VirtualRelationship;
import apoc.util.collection.Iterables;
import org.neo4j.graphdb.Label;
import org.neo4j.graphdb.Node;
import org.neo4j.graphdb.Path;
import org.neo4j.graphdb.Relationship;
import org.neo4j.graphdb.RelationshipType;
import org.neo4j.procedure.Description;
import org.neo4j.procedure.Name;
import org.neo4j.procedure.Procedure;
import org.neo4j.procedure.UserAggregationFunction;
import org.neo4j.procedure.UserAggregationResult;
import org.neo4j.procedure.UserAggregationUpdate;

import java.util.Collection;
import java.util.HashMap;
import java.util.Iterator;
import java.util.List;
import java.util.Map;
import java.util.stream.Stream;

@Extended
public class GraphsExtended {

    @Procedure("apoc.graph.filterProperties")
    @Description(
            "CALL apoc.graph.filterProperties(anyEntityObject, nodePropertiesToRemove, relPropertiesToRemove) YIELD nodes, relationships - returns a set of virtual nodes and relationships without the properties defined in nodePropertiesToRemove and relPropertiesToRemove")
    public Stream<GraphResult> fromData(
            @Name("value") Object value,
            @Name(value = "nodePropertiesToRemove", defaultValue = "{}") Map<String, List<String>> nodePropertiesToRemove,
            @Name(value = "relPropertiesToRemove", defaultValue = "{}") Map<String, List<String>> relPropertiesToRemove) {

        VirtualGraphExtractor extractor = new VirtualGraphExtractor(nodePropertiesToRemove, relPropertiesToRemove);
        extractor.extract(value);
        GraphResult result = new GraphResult(extractor.nodes(), extractor.rels());
        return Stream.of(result);
    }

    @UserAggregationFunction("apoc.graph.filterProperties")
    @Description(
            "apoc.graph.filterProperties(anyEntityObject, nodePropertiesToRemove, relPropertiesToRemove) - aggregation function which returns an object {nodes: [virtual nodes], relationships: [virtual relationships]} without the properties defined in nodePropertiesToRemove and relPropertiesToRemove")
    public GraphFunction filterProperties() {
        return new GraphFunction();
    }

    public static class GraphFunction {
        public static final String NODES = "nodes";
        public static final String RELATIONSHIPS = "relationships";

        private VirtualGraphExtractor virtualGraphExtractor;

        @UserAggregationUpdate
        public void filterProperties(
                @Name("value") Object value,
                @Name(value = "nodePropertiesToRemove", defaultValue = "{}") Map<String, List<String>> nodePropertiesToRemove,
                @Name(value = "relPropertiesToRemove", defaultValue = "{}") Map<String, List<String>> relPropertiesToRemove) {

            // The extractor is created lazily, using the filter maps of the first aggregated row.
            if (virtualGraphExtractor == null) {
                virtualGraphExtractor = new VirtualGraphExtractor(nodePropertiesToRemove, relPropertiesToRemove);
            }
            virtualGraphExtractor.extract(value);
        }

        @UserAggregationResult
        public Object result() {
            // If no row was aggregated the extractor was never created: return an empty graph.
            if (virtualGraphExtractor == null) {
                return Map.of(NODES, List.of(), RELATIONSHIPS, List.of());
            }
            Collection<Node> nodes = virtualGraphExtractor.nodes();
            Collection<Relationship> relationships = virtualGraphExtractor.rels();
            return Map.of(
                    NODES, nodes,
                    RELATIONSHIPS, relationships
            );
        }
    }

    public static class VirtualGraphExtractor {
        private static final String ALL_FILTER = "_all";

        private final Map<String, Node> nodes;
        private final Map<String, Relationship> rels;
        private final Map<String, List<String>> nodePropertiesToRemove;
        private final Map<String, List<String>> relPropertiesToRemove;

        public VirtualGraphExtractor(Map<String, List<String>> nodePropertiesToRemove, Map<String, List<String>> relPropertiesToRemove) {
            this.nodes = new HashMap<>();
            this.rels = new HashMap<>();
            this.nodePropertiesToRemove = nodePropertiesToRemove;
            this.relPropertiesToRemove = relPropertiesToRemove;
        }

        public void extract(Object value) {
            if (value == null) {
                return;
            }
            if (value instanceof Node node) {
                addVirtualNode(node);

            } else if (value instanceof Relationship rel) {
                addVirtualRel(rel);

            } else if (value instanceof Path path) {
                path.nodes().forEach(this::addVirtualNode);
                path.relationships().forEach(this::addVirtualRel);

            } else if (value instanceof Iterable) {
                ((Iterable<?>) value).forEach(this::extract);

            } else if (value instanceof Map<?,?> map) {
                map.values().forEach(this::extract);

            } else if (value instanceof Iterator) {
                ((Iterator<?>) value).forEachRemaining(this::extract);

            } else if (value instanceof Object[] array) {
                for (Object i : array) {
                    extract(i);
                }
            }
        }

        /**
         * We can use the elementId as a unique key for virtual nodes/relations,
         * as it is the same as the analogue for real nodes/relations.
         */
        private void addVirtualRel(Relationship rel) {
            rels.putIfAbsent(rel.getElementId(), createVirtualRel(rel));
        }

        private void addVirtualNode(Node node) {
            nodes.putIfAbsent(node.getElementId(), createVirtualNode(node));
        }

        private Node createVirtualNode(Node startNode) {
            // Start from all property keys, then drop those matched by label or by the "_all" key.
            List<String> props = Iterables.asList(startNode.getPropertyKeys());
            nodePropertiesToRemove.forEach((k,v) -> {
                if (k.equals(ALL_FILTER) || startNode.hasLabel(Label.label(k))) {
                    props.removeAll(v);
                }
            });

            return new VirtualNode(startNode, props);
        }

        private Relationship createVirtualRel(Relationship rel) {
            // computeIfAbsent returns the cached virtual node, creating it on first use.
            // putIfAbsent would return null here when the node has not been cached yet,
            // e.g. when a bare relationship is passed instead of a path.
            Node start = rel.getStartNode();
            Node virtualStart = nodes.computeIfAbsent(start.getElementId(), id -> createVirtualNode(start));

            Node end = rel.getEndNode();
            Node virtualEnd = nodes.computeIfAbsent(end.getElementId(), id -> createVirtualNode(end));

            Map<String, Object> props = rel.getAllProperties();

            relPropertiesToRemove.forEach((k,v) -> {
                if (k.equals(ALL_FILTER) || rel.isType(RelationshipType.withName(k))) {
                    v.forEach(props.keySet()::remove);
                }
            });

            return new VirtualRelationship(virtualStart, virtualEnd, rel.getType(), props);
        }

        public List<Node> nodes() {
            return List.copyOf(nodes.values());
        }

        public List<Relationship> rels() {
            return List.copyOf(rels.values());
        }
    }
}
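
The node cache used by `createVirtualRel` depends on what the `Map` insertion method returns. `Map.putIfAbsent` returns the previous mapping, which is `null` the first time a key is inserted, whereas `Map.computeIfAbsent` returns the value now associated with the key. A minimal sketch of the difference, using strings as hypothetical stand-ins for virtual nodes (`MapInsertSemantics` and its helpers are illustrative names, not part of the commit):

```java
import java.util.HashMap;
import java.util.Map;

public class MapInsertSemantics {
    /** Inserts a placeholder if the id is missing and returns what putIfAbsent reports. */
    public static String viaPutIfAbsent(Map<String, String> cache, String id) {
        // Returns the PREVIOUS value: null when the id was not cached yet.
        return cache.putIfAbsent(id, "virtual-" + id);
    }

    /** Inserts a placeholder if the id is missing and returns what computeIfAbsent reports. */
    public static String viaComputeIfAbsent(Map<String, String> cache, String id) {
        // Returns the CURRENT value: the existing one, or the newly computed one.
        return cache.computeIfAbsent(id, key -> "virtual-" + key);
    }

    public static void main(String[] args) {
        Map<String, String> cache = new HashMap<>();
        System.out.println(viaPutIfAbsent(cache, "n1"));     // null
        System.out.println(viaPutIfAbsent(cache, "n1"));     // virtual-n1
        System.out.println(viaComputeIfAbsent(cache, "n2")); // virtual-n2
        System.out.println(viaComputeIfAbsent(cache, "n2")); // virtual-n2
    }
}
```

When the returned value is used directly as a relationship endpoint, only the `computeIfAbsent` form is safe for nodes that have not been visited before.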
2 changes: 2 additions & 0 deletions extended/src/main/resources/extended.txt
@@ -74,6 +74,8 @@ apoc.generate.ws
apoc.gephi.add
apoc.get.nodes
apoc.get.rels
apoc.graph.filterProperties
apoc.graph.filterProperties
apoc.import.arrow
apoc.import.parquet
apoc.load.csv