|
28 | 28 | import org.neo4j.importer.v1.graph.Graphs; |
29 | 29 |
|
30 | 30 | /** |
31 | | - * Represents the entire parallelizable execution plan for an import step graph. |
32 | | - * Tasks are grouped into groups, as a list of independent ImportStepGroup. Each group can be processed |
33 | | - * entirely in parallel with the others. |
| 31 | + * {@link ImportExecutionPlan} exposes the graph of {@link ImportStep} to execute in a way that eases import |
| 32 | + * parallelization.<br><br> |
| 33 | + * The first level of parallelization is {@link org.neo4j.importer.v1.pipeline.ImportExecutionPlan.ImportStepGroup}, |
| 34 | + * retrieved with {@link ImportExecutionPlan#getGroups()}. |
| 35 | + * Each group corresponds to a weakly connected component of the import step graph.<br> |
| 36 | + * For instance, the following YAML serialization of {@link org.neo4j.importer.v1.ImportSpecification} (other attributes |
| 37 | + * are omitted for brevity): |
| 38 | + * <pre><code> |
| 39 | + * version: "1" |
| 40 | + * sources: |
| 41 | + * - name: actors |
| 42 | + * - name: films |
| 43 | + * targets: |
| 44 | + * nodes: |
| 45 | + * - source: actors |
| 46 | + * name: actor_nodes |
| 47 | + * - source: films |
| 48 | + * name: film_nodes |
| 49 | + * </code></pre> |
| 50 | + * <br> |
| 51 | + * ... results in 2 groups:<br><br> |
| 52 | + * - 1 with the "actors" source and "actor_nodes" node target (converted respectively to {@link SourceStep} and |
| 53 | + * {@link NodeTargetStep})<br> |
| 54 | + * - 1 with the "films" source and "film_nodes" node target (converted respectively to {@link SourceStep} and |
| 55 | + * {@link NodeTargetStep})<br> |
| 56 | + * <br> |
| 57 | + * These groups can be processed in parallel. |
| 58 | + * The import is considered completed when every group's import has completed.<br><br> |
| 59 | + * Each {@link org.neo4j.importer.v1.pipeline.ImportExecutionPlan.ImportStepGroup} is made of several |
| 60 | + * {@link org.neo4j.importer.v1.pipeline.ImportExecutionPlan.ImportStepStage}, retrieved with |
| 61 | + * {@link ImportStepGroup#getStages()}.<br> |
| 62 | + * Stages <strong>must</strong> be processed sequentially. In other words, the second stage cannot run until the first |
| 63 | + * stage has completed, and so on.<br> |
| 64 | + * <br> |
| 65 | + * Assuming the following YAML serialization of {@link org.neo4j.importer.v1.ImportSpecification} (other attributes are |
| 66 | + * omitted for brevity): |
| 67 | + * <pre><code> |
| 68 | + * version: "1" |
| 69 | + * sources: |
| 70 | + * - name: actors |
| 71 | + * - name: films |
| 72 | + * - name: actors_in_films |
| 73 | + * targets: |
| 74 | + * nodes: |
| 75 | + * - source: actors |
| 76 | + * name: actor_nodes |
| 77 | + * - source: films |
| 78 | + * name: film_nodes |
| 79 | + * relationships: |
| 80 | + * - source: actors_in_films |
| 81 | + * name: actor_film_relationships |
| 82 | + * start_node_reference: actor_nodes |
| 83 | + * end_node_reference: film_nodes |
| 84 | + * </code></pre> |
| 85 | + * This would result in a single {@link org.neo4j.importer.v1.pipeline.ImportExecutionPlan.ImportStepGroup} |
| 86 | + * (every step is linked, directly or indirectly). |
| 87 | + * The group is made of at least 3 stages:<br> |
| 88 | + * - the first stage includes all the sources<br> |
| 89 | + * - the second stage includes all the nodes<br> |
| 90 | + * - the last stage includes the relationship<br> |
| 91 | + * <br> |
| 92 | + * Finally, each stage is made of several steps. |
| 93 | + * These steps (either {@link SourceStep}, {@link NodeTargetStep}, {@link RelationshipTargetStep}, |
| 94 | + * {@link CustomQueryTargetStep} or {@link ActionStep}) can be processed in parallel.<br> |
| 95 | + * The enclosing stage execution is considered complete when all its steps have completed. |
34 | 96 | */ |
35 | 97 | public class ImportExecutionPlan { |
36 | 98 |
|
|
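The execution contract described in the Javadoc above (groups run in parallel, stages within a group run sequentially, steps within a stage run in parallel) can be sketched roughly as follows. This is only an illustration under stated assumptions: `PlanExecutionSketch` and its nested-list model are hypothetical stand-ins, not part of the library. A real consumer would presumably walk `ImportExecutionPlan#getGroups()` and `ImportStepGroup#getStages()` instead, and the accessor for a stage's steps is not shown in this diff.

```java
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class PlanExecutionSketch {

    // Hypothetical model: a plan is a list of groups, a group is a list of stages,
    // a stage is a list of steps, and each step is represented by a plain Runnable.
    public static void execute(List<List<List<Runnable>>> groups) {
        // Unbounded pool, so group tasks that block on their steps cannot starve those steps.
        ExecutorService executor = Executors.newCachedThreadPool();
        try {
            // Groups are independent of each other: start them all at once.
            CompletableFuture<?>[] groupFutures = groups.stream()
                    .map(stages -> CompletableFuture.runAsync(() -> runGroup(stages, executor), executor))
                    .toArray(CompletableFuture[]::new);
            // The import is complete once every group has completed.
            CompletableFuture.allOf(groupFutures).join();
        } finally {
            executor.shutdown();
        }
    }

    private static void runGroup(List<List<Runnable>> stages, ExecutorService executor) {
        // Stages are strictly sequential: the next stage starts only after the current one ends.
        for (List<Runnable> steps : stages) {
            // Steps of the same stage can run in parallel.
            CompletableFuture<?>[] stepFutures = steps.stream()
                    .map(step -> CompletableFuture.runAsync(step, executor))
                    .toArray(CompletableFuture[]::new);
            // The stage is complete once all of its steps have completed.
            CompletableFuture.allOf(stepFutures).join();
        }
    }
}
```

With the second YAML example, this would amount to one group whose stages hold the three sources, then the two node targets, then the relationship target. A bounded pool would need a different structure (for example chaining stages with `thenCompose`) to avoid blocking worker threads.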