Commit 94dffe0

docs: complete javadoc of ImportExecutionPlan and friends
1 parent 2961780 commit 94dffe0

2 files changed: 71 additions, 3 deletions

core/src/main/java/org/neo4j/importer/v1/pipeline/ImportExecutionPlan.java

Lines changed: 65 additions & 3 deletions
@@ -28,9 +28,71 @@
 import org.neo4j.importer.v1.graph.Graphs;
 
 /**
- * Represents the entire parallelizable execution plan for an import step graph.
- * Tasks are grouped into groups, as a list of independent ImportStepGroup. Each group can be processed
- * entirely in parallel with the others.
+ * {@link ImportExecutionPlan} exposes the graph of {@link ImportStep} to execute in a way that eases import
+ * parallelization.<br><br>
+ * The first level of parallelization is {@link org.neo4j.importer.v1.pipeline.ImportExecutionPlan.ImportStepGroup},
+ * retrieved with {@link ImportExecutionPlan#getGroups()}.
+ * Each group corresponds to a weakly connected component of the import step graph.<br>
+ * For instance, the following YAML serialization of {@link org.neo4j.importer.v1.ImportSpecification} (other attributes
+ * are omitted for brevity):
+ * <pre><code>
+ * version: "1"
+ * sources:
+ *   - name: actors
+ *   - name: films
+ * targets:
+ *   nodes:
+ *     - source: actors
+ *       name: actor_nodes
+ *     - source: films
+ *       name: film_nodes
+ * </code></pre>
+ * <br>
+ * ... results in 2 groups:<br><br>
+ * - 1 with the "actors" source and "actor_nodes" node target (converted respectively to {@link SourceStep} and
+ * {@link NodeTargetStep})<br>
+ * - 1 with the "films" source and "film_nodes" node target (converted respectively to {@link SourceStep} and
+ * {@link NodeTargetStep})<br>
+ * <br>
+ * These groups can be processed in parallel.
+ * The import is considered completed when every group's import has completed.<br><br>
+ * Each {@link org.neo4j.importer.v1.pipeline.ImportExecutionPlan.ImportStepGroup} is made of several
+ * {@link org.neo4j.importer.v1.pipeline.ImportExecutionPlan.ImportStepStage}, retrieved with
+ * {@link ImportStepGroup#getStages()}.<br>
+ * Stages <strong>must</strong> be processed sequentially. In other words, the second stage cannot run until the first
+ * stage has completed, and so on.<br>
+ * <br>
+ * Assuming the following YAML serialization of {@link org.neo4j.importer.v1.ImportSpecification} (other attributes are
+ * omitted for brevity):
+ * <pre><code>
+ * version: "1"
+ * sources:
+ *   - name: actors
+ *   - name: films
+ *   - name: actors_in_films
+ * targets:
+ *   nodes:
+ *     - source: actors
+ *       name: actor_nodes
+ *     - source: films
+ *       name: film_nodes
+ *   relationships:
+ *     - source: actors_in_films
+ *       name: actor_film_relationships
+ *       start_node_reference: actor_nodes
+ *       end_node_reference: film_nodes
+ * </code></pre>
+ * This would result in a single {@link org.neo4j.importer.v1.pipeline.ImportExecutionPlan.ImportStepGroup}
+ * (every step is linked, directly or indirectly).
+ * The group is made of at least 3 stages:<br>
+ * - the first stage includes all the sources<br>
+ * - the second stage includes all the nodes<br>
+ * - the last stage includes the relationship<br>
+ * <br>
+ * Finally, each stage is made of several steps.
+ * These steps (either {@link SourceStep}, {@link NodeTargetStep}, {@link RelationshipTargetStep},
+ * {@link CustomQueryTargetStep} or {@link ActionStep}) can be processed in parallel.<br>
+ * The enclosing stage execution is considered complete when all its steps have completed.
  */
 public class ImportExecutionPlan {

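The Javadoc above describes groups as weakly connected components of the step graph, and stages as dependency layers inside a group. The sketch below illustrates that idea on the second YAML example. It is not the library's implementation: the `Map<String, Set<String>>` graph representation, the class name `StepGroupingSketch`, and the assumed dependency edges (each target depends on its source, and the relationship depends on its start and end node targets) are all hypothetical stand-ins chosen for illustration.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.HashMap;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

public class StepGroupingSketch {

    // Hypothetical stand-in for the step graph: step name -> names of the steps it depends on.
    // The edges are an assumption made for this sketch, not taken from the library.
    static Map<String, Set<String>> exampleGraph() {
        Map<String, Set<String>> graph = new LinkedHashMap<>();
        graph.put("actors", Set.of());
        graph.put("films", Set.of());
        graph.put("actors_in_films", Set.of());
        graph.put("actor_nodes", Set.of("actors"));
        graph.put("film_nodes", Set.of("films"));
        graph.put("actor_film_relationships", Set.of("actors_in_films", "actor_nodes", "film_nodes"));
        return graph;
    }

    // Splits the steps into weakly connected components: edge direction is ignored.
    static List<Set<String>> groups(Map<String, Set<String>> graph) {
        // Build an undirected adjacency view of the dependency graph.
        Map<String, Set<String>> undirected = new LinkedHashMap<>();
        graph.forEach((step, deps) -> {
            undirected.computeIfAbsent(step, k -> new LinkedHashSet<>()).addAll(deps);
            deps.forEach(dep -> undirected.computeIfAbsent(dep, k -> new LinkedHashSet<>()).add(step));
        });
        List<Set<String>> result = new ArrayList<>();
        Set<String> visited = new HashSet<>();
        for (String start : undirected.keySet()) {
            if (!visited.add(start)) {
                continue; // already assigned to an earlier component
            }
            Set<String> component = new LinkedHashSet<>();
            Deque<String> pending = new ArrayDeque<>(List.of(start));
            while (!pending.isEmpty()) {
                String step = pending.pop();
                component.add(step);
                for (String neighbor : undirected.get(step)) {
                    if (visited.add(neighbor)) {
                        pending.push(neighbor);
                    }
                }
            }
            result.add(component);
        }
        return result;
    }

    // Layers the steps of one group into stages: stage 0 holds steps without dependencies,
    // any other step lands one stage after its latest dependency. Assumes an acyclic graph.
    static List<Set<String>> stages(Map<String, Set<String>> graph, Set<String> group) {
        Map<String, Integer> levels = new HashMap<>();
        List<Set<String>> result = new ArrayList<>();
        for (String step : group) {
            int level = level(step, graph, levels);
            while (result.size() <= level) {
                result.add(new LinkedHashSet<>());
            }
            result.get(level).add(step);
        }
        return result;
    }

    private static int level(String step, Map<String, Set<String>> graph, Map<String, Integer> levels) {
        Integer known = levels.get(step);
        if (known != null) {
            return known;
        }
        int level = 0;
        for (String dep : graph.getOrDefault(step, Set.of())) {
            level = Math.max(level, 1 + level(dep, graph, levels));
        }
        levels.put(step, level);
        return level;
    }

    public static void main(String[] args) {
        Map<String, Set<String>> graph = exampleGraph();
        for (Set<String> group : groups(graph)) {
            System.out.println("group: " + group);
            System.out.println("stages: " + stages(graph, group));
        }
    }
}
```

Under these assumed edges, the example yields one group with three stages (sources, then node targets, then the relationship target). Removing the relationship and its source leaves two independent groups, matching the first YAML example in the Javadoc.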
core/src/main/java/org/neo4j/importer/v1/pipeline/ImportPipeline.java

Lines changed: 6 additions & 0 deletions
@@ -121,6 +121,12 @@ public Iterator<ImportStep> iterator() {
         return stepGraph.keySet().iterator();
     }
 
+    /**
+     * Returns an {@link ImportExecutionPlan}, which makes it easier to write parallelizable imports, compared to the
+     * sequential {@link Iterable} API that {@link ImportPipeline} exposes.<br>
+     * Please consult the documentation of {@link ImportExecutionPlan} for more details.
+     * @return the import execution plan
+     */
     public ImportExecutionPlan executionPlan() {
         return ImportExecutionPlan.fromGraph(stepGraph);
     }

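To make the scheduling contract concrete (groups in parallel, stages in sequence within a group, steps in parallel within a stage), here is a minimal sketch of a consumer. It deliberately does not use the real importer classes: the `Plan`, `Group`, `Stage` and `Step` records and their accessors are hypothetical stand-ins, and "running" a step only records its name. A real consumer would walk the plan through `getGroups()` and `getStages()` as documented above, and how steps are read from a stage is not shown in this commit.

```java
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.Executor;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class PlanExecutionSketch {

    // Hypothetical stand-ins for the execution plan and its parts (not the library's types).
    record Step(String name) {}
    record Stage(List<Step> steps) {}
    record Group(List<Stage> stages) {}
    record Plan(List<Group> groups) {}

    // Starts every group at once and waits for all of them.
    // Returns step names in the order in which the steps completed.
    static List<String> run(Plan plan, Executor executor) {
        List<String> completed = new CopyOnWriteArrayList<>();
        CompletableFuture<?>[] groups = plan.groups().stream()
                .map(group -> runGroup(group, executor, completed))
                .toArray(CompletableFuture[]::new);
        CompletableFuture.allOf(groups).join();
        return completed;
    }

    // Chains the stages of a group so that each one starts only after the previous one is done.
    static CompletableFuture<Void> runGroup(Group group, Executor executor, List<String> completed) {
        CompletableFuture<Void> previous = CompletableFuture.completedFuture(null);
        for (Stage stage : group.stages()) {
            previous = previous.thenCompose(ignored -> runStage(stage, executor, completed));
        }
        return previous;
    }

    // Submits all steps of a stage concurrently; the stage is complete when all its steps are.
    static CompletableFuture<Void> runStage(Stage stage, Executor executor, List<String> completed) {
        CompletableFuture<?>[] steps = stage.steps().stream()
                .map(step -> CompletableFuture.runAsync(() -> completed.add(step.name()), executor))
                .toArray(CompletableFuture[]::new);
        return CompletableFuture.allOf(steps);
    }

    // Single-group plan mirroring the second YAML example of the ImportExecutionPlan Javadoc.
    static Plan examplePlan() {
        Stage sources = new Stage(List.of(new Step("actors"), new Step("films"), new Step("actors_in_films")));
        Stage nodes = new Stage(List.of(new Step("actor_nodes"), new Step("film_nodes")));
        Stage relationships = new Stage(List.of(new Step("actor_film_relationships")));
        return new Plan(List.of(new Group(List.of(sources, nodes, relationships))));
    }

    public static void main(String[] args) {
        ExecutorService executor = Executors.newFixedThreadPool(4);
        try {
            System.out.println(run(examplePlan(), executor));
        } finally {
            executor.shutdown();
        }
    }
}
```

Stages are chained with `thenCompose` rather than blocking a worker thread per group, so a small fixed-size pool cannot deadlock while waiting on its own step tasks. The order of steps inside a stage is not deterministic, but every step of one stage completes before any step of the next stage starts.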