Releases · neo4j/graph-data-science

27 Apr 10:34

gminneci

2.3.4

043e0b3

2.3.4

Bug fixes

gds.beta.pipeline.linkPrediction.train: sampled relationships now contain only valid node ids, which avoids an ArrayIndexOutOfBoundsException during training.


21 Apr 13:10

gminneci

2.3.3

fe5d867

Graph Data Science 2.3.3

New features

Neo4j Database Compatibility

This release is compatible with all Neo4j 5.x database versions <= 5.7.0. Please see our compatibility matrix above.
Added includeGraphs parameter to gds.alpha.backup to allow backups without graphs.

Bug fixes

Multiclass node classification is now compatible with non-consecutive class ids
RandomWalk is now stable across multiple runs (user contribution by GitHub user hindog)
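
To illustrate what "non-consecutive class ids" means for multiclass classification: class labels such as 0, 5 and 42 need to be mapped onto a dense index range before they can address positions in a fixed-size structure. The following is a minimal Python sketch of such a mapping, not the actual GDS implementation; the function name and the sample labels are hypothetical.

```python
def build_class_index(class_ids):
    """Map arbitrary integer class ids to consecutive indices 0..k-1."""
    # Sort the distinct ids so the mapping is deterministic
    # regardless of the order in which labels are seen.
    distinct = sorted(set(class_ids))
    return {class_id: index for index, class_id in enumerate(distinct)}


# Hypothetical labels with gaps between the class ids.
labels = [42, 0, 5, 42, 5]
class_index = build_class_index(labels)
dense_labels = [class_index[class_id] for class_id in labels]

print(class_index)   # {0: 0, 5: 1, 42: 2}
print(dense_labels)  # [2, 0, 1, 2, 1]
```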

Improvements

Make gds.alpha.restore more failsafe
- Continue to restore graphs and models also after the first failure for a user.
- Improve logging around failures

Full Changelog: 2.3.2...2.3.3


11 Apr 10:18

Mats-SX

2.3.2

ca061d5

Graph Data Science 2.3.2

GDS 2.3.2 is compatible with Neo4j 5 & 4.4 versions (≥ 4.4.9).

For GDS compatibility with previous releases, please use GDS Compatibility Table.

New features

Neo4j Database Compatibility

This release is compatible with all Neo4j 5.x database versions <= 5.6.0. Please see our compatibility matrix above.

Bug fixes

Graphs imported via Arrow no longer cause invalid node mappings that produced ArrayIndexOutOfBoundsExceptions
Correct memory estimation of Leiden for very small graphs
KNN no longer results in an AIOOB exception if the array node properties do not exist for some nodes
CELF no longer returns negative gains for some nodes
GraphSage will no longer return NaN values because of incorrect neighbor sampling

Improvements

More accurate memory estimation on Node Similarity and filtered Node Similarity algorithms for high topN or topK values.
The gds.alpha.modularity procedures for computing modularity no longer require each community to be smaller than the size of the graph.
Improve the progress logging of gds.graph.project.cypher to be more accurate. Especially, this avoids underestimating when the relationship query is more complex.


16 Feb 15:42

laeg

2.3.1

412ed2a

Graph Data Science 2.3.1

GDS 2.3.1 is compatible with Neo4j 5 & 4.4 versions (≥ 4.4.9) & 4.3 versions (≥ 4.3.15) Database.

For GDS compatibility with previous releases, please use GDS Compatibility Table.

New features

Neo4j Database Compatibility

This release is compatible with all Neo4j 5.x database versions <= 5.5.0. Please see our compatibility matrix above.

Log Progress

The new optional configuration parameter logProgress lets you specify whether percentage logging is on or off for that procedure call.
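
As a rough illustration of where logProgress goes, the Python sketch below builds a parameterized Cypher call and its parameter map. The helper name, the graph name and the choice of gds.pageRank.stream are assumptions made for the example; only the logProgress key comes from these release notes.

```python
def page_rank_call(graph_name, log_progress):
    """Return a Cypher query and its parameters for a PageRank stream call."""
    # The configuration map is passed as a query parameter rather than
    # concatenated into the query text.
    query = (
        "CALL gds.pageRank.stream($graphName, $config) "
        "YIELD nodeId, score RETURN nodeId, score"
    )
    params = {
        "graphName": graph_name,
        "config": {"logProgress": log_progress},
    }
    return query, params


# Hypothetical graph name; False turns percentage logging off for this call.
query, params = page_rank_call("myGraph", log_progress=False)
print(query)
print(params)
```

The returned pair can then be handed to whatever client executes Cypher against the database.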

Bug fixes

Louvain no longer reports an incorrect modularity
Leiden now reports communities correctly on weighted graphs
Persisted Models no longer cause false positive error logs when loaded into the Model Catalog
Yens on graphs without parallel relationships would cause issues

Improvements

Filtered Node Similarity progress logging has been improved


01 Feb 08:31

laeg

2.3.0

2c4c5ed

Graph Data Science 2.3.0

GDS 2.3.0 is compatible with Neo4j 5 & 4.4 versions (≥ 4.4.9) & 4.3 versions (≥ 4.3.15) Database.

For GDS compatibility with previous releases, please use GDS Compatibility Table.

Breaking changes

Leiden was promoted to the beta tier. It is now called via the gds.beta.leiden command instead of the gds.alpha.leiden command.
K-means was promoted to the beta tier. It is now called via the gds.beta.kmeans command instead of the gds.alpha.kmeans command.
Minimum weighted spanning tree algorithm was promoted to the beta tier. It is now called via the gds.beta.spanningTree command instead of gds.alpha.spanningTree.
- The procedures gds.alpha.spanningTree.minimum and gds.alpha.spanningTree.maximum have been removed. You can get the same behaviour by specifying the new parameter objective in gds.beta.spanningTree.
- The weightWriteProperty has been removed as a configuration parameter. To supply the Relationship Type and Property for the produced relationship, use:
  - mutateRelationshipType
  - mutateProperty
- gds.alpha.spanningTree.kmin and gds.alpha.spanningTree.kmax have been removed, as the K-Spanning Tree algorithm has been moved into its own namespace, gds.alpha.kSpanningTree.
- The parameter startNodeId in all Spanning Tree algorithms has been replaced with sourceNode.
Arrow: when projecting graphs, null will be translated to NaN for floating point values. This enables users of either the GDS Python Client or PyArrow to load NaN properties stored in Pandas DataFrames
Cypher Aggregations will become the primary surface for creating projections with Cypher. They offer a more intuitive and expressive interface than Cypher Projections and can also be used in Fabric or Composite Database setups.
The algorithm gds.alpha.influenceMaximization.greedy has been removed. Its replacement is the gds.beta.influenceMaximization.celf algorithm, which has the same configuration parameters and offers better performance.
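
The renames above can be summarised as a small migration helper. This is a hedged Python sketch, not an official tool: the function name is hypothetical, the objective values 'minimum' and 'maximum' are assumed to mirror the removed procedure names, and the kmin/kmax move and the removed weightWriteProperty are deliberately not handled.

```python
# Old -> new procedure prefixes, taken from the breaking changes listed above.
RENAMED_PREFIXES = {
    "gds.alpha.leiden": "gds.beta.leiden",
    "gds.alpha.kmeans": "gds.beta.kmeans",
    "gds.alpha.influenceMaximization.greedy": "gds.beta.influenceMaximization.celf",
}


def migrate_call(procedure, config):
    """Return (procedure, config) adjusted to the GDS 2.3.0 names."""
    config = dict(config)  # work on a copy, never mutate the caller's dict

    # gds.alpha.spanningTree.minimum/maximum became gds.beta.spanningTree
    # with an `objective` parameter (value names are an assumption).
    for objective in ("minimum", "maximum"):
        old = f"gds.alpha.spanningTree.{objective}"
        if procedure == old or procedure.startswith(old + "."):
            config["objective"] = objective
            # startNodeId was replaced by sourceNode.
            if "startNodeId" in config:
                config["sourceNode"] = config.pop("startNodeId")
            return "gds.beta.spanningTree" + procedure[len(old):], config

    # Plain tier promotions and the greedy -> celf replacement.
    for old, new in RENAMED_PREFIXES.items():
        if procedure == old or procedure.startswith(old + "."):
            return new + procedure[len(old):], config

    # Anything else is returned unchanged.
    return procedure, config


print(migrate_call("gds.alpha.spanningTree.minimum.write", {"startNodeId": 7}))
print(migrate_call("gds.alpha.leiden.stream", {}))
```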

New features

Neo4j Database Compatibility

This release is compatible with all Neo4j 5.x database versions <= 5.4.0. Please see our compatibility matrix above.

Minimum Directed Steiner Tree

Added heuristic for minimum directed Steiner Tree under the gds.beta.steinerTree domain.
- Added stats mode with gds.beta.steinerTree.stats
- Added stream mode with gds.beta.steinerTree.stream
- Added mutate mode with gds.beta.steinerTree.mutate
- Added write mode with gds.beta.steinerTree.write
- Now available in progress tracking - gds.list.progress()

Leiden

New parameter consecutiveIds that assigns consecutive ids for the discovered communities.
New parameter seedProperty to seed initial communities for nodes.
New parameter tolerance to enable convergence criteria based on differences in modularity from one iteration to another.
Now available in progress tracking - gds.list.progress()
Added memory estimation mode:
- gds.beta.leiden.mutate.estimate
- gds.beta.leiden.stats.estimate
- gds.beta.leiden.stream.estimate
- gds.beta.leiden.write.estimate
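
The tolerance parameter describes a convergence criterion based on the change in modularity between iterations. The Python sketch below shows one plausible reading of that criterion, assuming the run stops once the change drops below the tolerance; it is not the GDS implementation, and the modularity values are made up for the example.

```python
def iterations_until_stable(modularity_per_iteration, tolerance):
    """Return the iteration at which the modularity change drops below tolerance.

    If the change never drops below the tolerance, all iterations are used.
    """
    previous = None
    for iteration, modularity in enumerate(modularity_per_iteration, start=1):
        # Compare with the previous iteration once there is one.
        if previous is not None and abs(modularity - previous) < tolerance:
            return iteration
        previous = modularity
    return len(modularity_per_iteration)


# Hypothetical modularity values from five successive iterations.
history = [0.10, 0.30, 0.38, 0.3805, 0.3806]
print(iterations_until_stable(history, tolerance=0.001))  # 4
```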

Logistic Regression & MLP

New configuration parameters classWeights and focusWeight for training methods, supported by procedures:
- gds.beta.pipeline.nodeClassification.addLogisticRegression
- gds.beta.pipeline.nodeClassification.addMLP
- gds.beta.pipeline.linkPrediction.addLogisticRegression
- gds.beta.pipeline.linkPrediction.addMLP
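
The release notes do not spell out how classWeights and focusWeight enter the loss. A common formulation is a class-weighted focal loss, sketched below in Python under the assumption that focusWeight is the exponent of the focal factor and classWeights scales the loss per class. The function name and the numbers are hypothetical.

```python
import math


def weighted_focal_loss(predicted, true_class, class_weights, focus_weight):
    """Loss for one example given predicted class probabilities.

    With focus_weight = 0 and all class weights equal to 1 this reduces
    to the ordinary cross-entropy loss -log(p_true).
    """
    p = predicted[true_class]
    # (1 - p) ** focus_weight down-weights examples that are already
    # classified confidently, so training focuses on hard examples.
    return -class_weights[true_class] * (1.0 - p) ** focus_weight * math.log(p)


probabilities = [0.1, 0.9]  # hypothetical model output for two classes
plain = weighted_focal_loss(probabilities, 1, [1.0, 1.0], focus_weight=0.0)
focused = weighted_focal_loss(probabilities, 1, [1.0, 1.0], focus_weight=2.0)
weighted = weighted_focal_loss(probabilities, 1, [1.0, 3.0], focus_weight=0.0)

print(plain, focused, weighted)
```

In this sketch an easy example (probability 0.9 for the true class) contributes far less once the focus exponent is raised, while a larger class weight scales its contribution up.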

HashGNN

New algorithm gds.alpha.hashgnn.{mutate,stream} to create HashGNN node embeddings
New estimation procedures gds.alpha.hashgnn.{mutate,stream}.estimate to estimate the memory required to run HashGNN

Link Prediction

Added new optional configuration parameter negativeRelationshipType to gds.beta.pipeline.linkPrediction.configureSplit

Spanning Tree

New modes supported: gds.beta.spanningTree.{stats, stream, mutate}
New yield outputs for gds.beta.spanningTree:
- the sum of weights in the discovered spanning tree.
- the number of relationships written or added for write and mutate mode respectively.
Added memory estimation mode:
- gds.beta.spanningTree.stream.estimate
- gds.beta.spanningTree.mutate.estimate
- gds.beta.spanningTree.stats.estimate
- gds.beta.spanningTree.write.estimate

Write Labels

gds.alpha.graph.nodeLabel.mutate allows for the Graph Projection to be mutated with new labels
gds.alpha.graph.nodeLabel.write allows for Node Labels to be written back from projections to a Neo4j Database

Graph Projections

Arrow now supports specifying undirected relationship types using the undirected_relationship_types configuration argument
Cypher Aggregations (gds.alpha.graph.project) now support specifying undirected relationship types using the undirectedRelationshipTypes configuration option
New procedure to turn directed relationships into undirected relationships: gds.beta.graph.relationships.toUndirected
Projections created using the Native, Arrow, or Cypher Aggregation APIs can now be "inverse indexed", which enables more efficient algorithm implementations
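
Conceptually, turning directed relationships into undirected ones means making every relationship traversable in both directions, and an inverse index means being able to look up incoming relationships by target node. The Python sketch below illustrates both ideas on a plain edge list. It is only an illustration: the function names are hypothetical, and how GDS stores these structures or treats parallel relationships is not covered here.

```python
from collections import defaultdict


def to_undirected(relationships):
    """Return every (source, target) pair in both directions, without duplicates."""
    undirected = set()
    for source, target in relationships:
        undirected.add((source, target))
        undirected.add((target, source))
    return sorted(undirected)


def build_inverse_index(relationships):
    """Map each target node to the sorted list of its source nodes."""
    incoming = defaultdict(list)
    for source, target in relationships:
        incoming[target].append(source)
    return {node: sorted(sources) for node, sources in incoming.items()}


directed = [(0, 1), (0, 2), (2, 1)]
print(to_undirected(directed))
# [(0, 1), (0, 2), (1, 0), (1, 2), (2, 0), (2, 1)]
print(build_inverse_index(directed))
# {1: [0, 2], 2: [0]}
```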

Administration

Added the jobId and username to the ongoingGdsProcedures return field of gds.alpha.systemMonitor.
Added username as a new return field to gds.beta.listProgress.
Added a new return field to gds.graph.list called schemaWithOrientation which also includes the orientation.
Administrators can now see all running tasks from all users with gds.beta.listProgress

Bug fixes

Minimum Weighted Spanning Tree: Graphs with parallel edges could make the discovered tree have wrong weights on relationships
Cypher Aggregations: When using gds.alpha.graph.project:
- The projected graph would list relationship types with zero relationships
- AIOOB exceptions could surface due to sizing errors
Arrow: the CREATE_DATABASE action would throw a NullPointerException if ID fields were missing in the Arrow record. A more descriptive exception is now provided
gds.graph.list could cause issues on some JDKs when calculating the memory usage of Projections
Export relationship progress logging (gds.beta.graph.export.csv) reports the correct progress
Graphs constructed with Cypher Aggregation using arbitrary IDs are now blocked from write procedures
The k-Spanning Tree algorithm no longer returns disconnected partitions
Multi-threading bug when creating projections via Cypher Aggregation or Arrow could lead to lost labels
Node label filtering could lead to streamed node properties being null when filters are applied
Cypher projections and Cypher aggregation would throw the wrong error message when loading an invalid relationship
Node label filtering could lead to wrong results. This also affected gds.beta.graphSage and gds.beta.graph.relationships.stream

Improvements

Arrow

graph import now fully supports external node ids in the 64 Bit space.
graph import now supports 16, 32 or 64 Bit node identifiers.
Arrow server will now check user RBAC permissions for creating and accessing databases
Database import now creates a Relationship Type index

Leiden

Better parallelization and improved overall performance

WCC

Now supports a new and faster sampling strategy (for undirected and directed graphs) by using the new inverse index.

Machine Learning

Inner components of pipeline field returned by gds.pipeline.{ linkPrediction | nodeClassification | nodeRegression }.train procedures are now present directly as part of modelInfo. The pipeline field is now deprecated for removal in a future version.

Other Improvements

Speed improvements for Dijkstra, Astar, Yens, CELF, weighted Betweenness Centrality, and the Spanning Tree algorithms. The improvements will see a slight increase in the memory consumption of these algorithms.
Improved error message for invalid node labels and relationship types
Pregel now supports bidirectional computations (allows for messages to be sent along incoming relationships) using the new inverse index.
The procedure gds.graph.export now creates a Relationship Type index
Extended node property validation to reject projection configuration mappings with the same property keys, but different default values.
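
The extended property validation can be pictured as a check across all label projections: the same property key may appear under several labels, but only with one default value. Below is a minimal Python sketch of such a check, assuming a simplified configuration shape of label to property key to default value; the function names and the sample configuration are hypothetical and do not reflect the GDS internals.

```python
def same_default(a, b):
    # NaN never compares equal to itself, so treat two NaN defaults as equal.
    return a == b or (a != a and b != b)


def validate_property_defaults(projections):
    """Raise ValueError if a property key is mapped with conflicting defaults.

    `projections` maps label -> {property_key: default_value}.
    """
    seen = {}  # property key -> (label, default) of its first occurrence
    for label, properties in projections.items():
        for key, default in properties.items():
            if key in seen and not same_default(seen[key][1], default):
                first_label, first_default = seen[key]
                raise ValueError(
                    f"Property '{key}' has default {first_default!r} for "
                    f"'{first_label}' but {default!r} for '{label}'"
                )
            seen.setdefault(key, (label, default))


# Accepted: both labels use the same default for `age`.
validate_property_defaults({"Person": {"age": 0}, "Pet": {"age": 0}})

# Rejected: same key, different defaults.
try:
    validate_property_defaults({"Person": {"age": 0}, "Pet": {"age": -1}})
except ValueError as error:
    print(error)
```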

Other changes

Histograms returned such as degreeDistribution in gds.graph.list can have slightly different values for specific percentiles due to changes in floating point operations.
Progress tracking in the Spanning Tree algorithm has been reworked. Progress reporting may differ from earlier versions.
Mark the yielded field schema as deprecated in gds.graph.list and gds.graph.drop. In the next major release, the schema field will use the semantics of schemaWithOrientation
In gds.alpha.model.store, the positional argument failIfUnsupportedType is renamed to failIfUnsupported. Both will be supported until it is promoted to the beta tier.
Progress tracking for Betweenness Centrality has been reworked. Progress reporting may differ from earlier versions.

Pre-release changes

The Steiner Tree procedures in gds.beta.steinerTree were originally introduced as gds.alpha.steinerTree. The renaming occurred in 2.3.0-alpha04.


27 Jan 09:49

laeg

2.2.7

e218fb4

Graph Data Science 2.2.7

GDS 2.2.7 is compatible with Neo4j 5 & 4.4 versions (≥ 4.4.9) & 4.3 versions (≥ 4.3.15) Database.

For GDS compatibility with previous releases, please use GDS Compatibility Table.

New features

Added compatibility for Neo4j database 5.4.0.

Bug fixes

Missing id fields in the Arrow records for the CREATE_DATABASE action would throw a NullPointerException. It now throws a more descriptive exception instead.
Graphs with long node or relationship property names would fail during the restore process.
Yens algorithm would ignore edges in multigraphs and yield incorrect results.
Multi-threading bug when creating projections via Cypher Aggregation or Arrow could lead to lost labels.
Node label filtering could lead to streamed node properties being null when filters are applied.
Cypher projections and Cypher aggregation would throw the wrong error message when loading an invalid relationship.
Node label filtering could lead to wrong results. This also affected gds.beta.graphSage and gds.beta.graph.relationships.stream.


05 Jan 15:58

laeg

2.3.0-alpha04

355178d

Graph Data Science 2.3.0-Alpha04 Pre-release

Pre-release

GDS 2.3.0-alpha04 is compatible with Neo4j 5 & 4.4 versions (≥ 4.4.9) & 4.3 versions (≥ 4.3.15) Database.

For GDS compatibility with previous releases, please use GDS Compatibility Table.

Breaking changes

Leiden was promoted to the beta tier. It is now called via the gds.beta.leiden command instead of the gds.alpha.leiden command.
K-means was promoted to the beta tier. It is now called via the gds.beta.kmeans command instead of the gds.alpha.kmeans command.
Minimum weighted spanning tree algorithm was promoted to the beta tier. It is now called via the gds.beta.spanningTree command instead of gds.alpha.spanningTree.
- The procedures gds.alpha.spanningTree.minimum and gds.alpha.spanningTree.maximum have been removed. You can get the same behaviour by specifying the new parameter objective in gds.beta.spanningTree.
- The weightWriteProperty has been removed as a configuration parameter. To supply the Relationship Type and Property for the produced relationship, use:
  - mutateRelationshipType
  - mutateProperty
- gds.alpha.spanningTree.kmin and gds.alpha.spanningTree.kmax have been removed, as the K-Spanning Tree algorithm has been moved into its own namespace, gds.alpha.kSpanningTree.
- The parameter startNodeId in all Spanning Tree algorithms has been replaced with sourceNode.
Arrow: when projecting graphs, null will be translated to NaN for floating point values. This enables users of either the GDS Python Client or PyArrow to load NaN properties stored in Pandas DataFrames
Cypher Aggregations will become the primary surface for creating projections with Cypher. They offer a more intuitive and expressive interface than Cypher Projections and can also be used in Fabric or Composite Database setups.
The algorithm gds.alpha.influenceMaximization.greedy has been removed. Its replacement is the already existing gds.beta.influenceMaximization.celf algorithm, which has the same configuration parameters and offers better performance.

New features

Minimum Directed Steiner Tree

Added heuristic for minimum directed Steiner Tree under the gds.beta.steinerTree domain.
- Added stats mode with gds.beta.steinerTree.stats
- Added stream mode with gds.beta.steinerTree.stream
- Added mutate mode with gds.beta.steinerTree.mutate
- Added write mode with gds.beta.steinerTree.write
- Now available in progress tracking - gds.list.progress()

Leiden

New parameter consecutiveIds that assigns consecutive ids for the discovered communities.
New parameter seedProperty to seed initial communities for nodes.
New parameter tolerance to enable convergence criteria based on difference in modularity from one iteration to another.
Now available in progress tracking - gds.list.progress()
Added memory estimation mode:
- gds.beta.leiden.mutate.estimate
- gds.beta.leiden.stats.estimate
- gds.beta.leiden.stream.estimate
- gds.beta.leiden.write.estimate

Logistic Regression & MLP

New configuration parameters classWeights and focusWeight for training methods, supported by procedures:
- gds.beta.pipeline.nodeClassification.addLogisticRegression
- gds.beta.pipeline.nodeClassification.addMLP
- gds.beta.pipeline.linkPrediction.addLogisticRegression
- gds.beta.pipeline.linkPrediction.addMLP

HashGNN

New algorithm gds.alpha.hashgnn.{mutate,stream} to create HashGNN node embeddings
New procedures gds.alpha.hashgnn.{mutate,stream}.estimate to estimate the memory required to run HashGNN

Link Prediction

Added new optional configuration parameter negativeRelationshipType to gds.beta.pipeline.linkPrediction.configureSplit

Spanning Tree

New modes supported: gds.beta.spanningTree.(stats, stream, mutate)
New yield output for gds.beta.spanningTree that outputs the sum of weights in the discovered spanning tree.
New yield output for gds.beta.spanningTree that outputs the number of relationships written or added for write and mutate mode respectively.
Added memory estimation mode:
- gds.beta.spanningTree.stream.estimate
- gds.beta.spanningTree.mutate.estimate
- gds.beta.spanningTree.stats.estimate
- gds.beta.spanningTree.write.estimate

Write Labels

Added gds.alpha.graph.nodeLabel.write to allow for Node Labels to be written back from projections to a Neo4j Database

Graph Projections

Arrow now supports specifying undirected relationship types using the undirected_relationship_types configuration argument
Cypher Aggregations (gds.alpha.graph.project) now support specifying undirected relationship types using the undirectedRelationshipTypes configuration option
New procedure to turn directed relationships into undirected relationships: gds.beta.graph.relationships.toUndirected

Administration

Added the jobId and username to the ongoingGdsProcedures return field of gds.alpha.systemMonitor.
Added username as a new return field to gds.beta.listProgress.
Added a new return field to gds.graph.list called schemaWithOrientation which also includes the orientation.
Administrators can now see all running tasks from all users with gds.beta.listProgress

Bug fixes

Minimum Weighted Spanning Tree: Graphs with parallel edges could make the discovered tree have wrong weights on relationships
Cypher Aggregations: When using gds.alpha.graph.project:
- The projected graph would list relationship types with zero relationships
- AIOOB exceptions could surface due to sizing errors
Arrow: the CREATE_DATABASE action would throw a NullPointerException if id fields were missing in the Arrow record. A more descriptive exception is now provided

Improvements

Arrow

graph import now fully supports external node ids in the 64 Bit space.
graph import now supports 16, 32 or 64 Bit node identifiers.

Leiden

Better parallelization and improved overall performance

Other Improvements

Speed improvements for Dijkstra, Astar, Yens, CELF, weighted Betweenness Centrality, and the Spanning Tree algorithms. The improvements will see a slight increase in the memory consumption of these algorithms.
Improved error message for invalid node labels and relationship types

Other changes

Histograms returned such as degreeDistribution in gds.graph.list can have slightly different values for specific percentiles due to changes in floating point operations.
Progress tracking in the Spanning Tree algorithm has been reworked. Progress reporting may differ from earlier versions.
Mark the yielded field schema as deprecated in gds.graph.list and gds.graph.drop. In the next major release, the schema field will use the semantics of schemaWithOrientation
In gds.alpha.model.store, the positional argument failIfUnsupportedType is renamed to failIfUnsupported. Both will be supported until it is promoted to the beta tier.
Progress tracking for Betweenness Centrality has been reworked. Progress reporting may differ from earlier versions.


16 Dec 09:15

laeg

2.2.6

3f5e0fa

Graph Data Science 2.2.6

Neo4j Graph Data Science 2.2.6 is compatible with Neo4j Database 5 versions & 4.4 versions ≥ 4.4.9 & 4.3 versions ≥ 4.3.15.

For Neo4j Graph Data Science compatibility, please use the Neo4j Compatibility Matrix.

Improvements

Added support for Neo4j Database 5.3


01 Dec 15:45

laeg

2.3.0-alpha03

7a08fd9

Graph Data Science 2.3.0-alpha03 Pre-release

Pre-release

GDS 2.3.0-alpha03 is compatible with Neo4j 5 versions & 4.4 versions ≥ 4.4.9 & 4.3 versions ≥ 4.3.15.

For GDS compatibility with previous releases, please use GDS Compatibility Table.

Breaking changes

Leiden was promoted to the beta tier. It is now called via the gds.beta.leiden command instead of the gds.alpha.leiden command.
K-means was promoted to the beta tier. It is now called via the gds.beta.kmeans command instead of the gds.alpha.kmeans command.
The parameter startNodeId in Spanning Tree algorithms has been replaced with sourceNode.
The minimum weighted spanning tree algorithm was moved to the beta tier. It is now called via the gds.beta.spanningTree command instead of gds.alpha.spanningTree.
- The procedures gds.alpha.spanningTree.minimum and gds.alpha.spanningTree.maximum have been removed. You can get the same behavior by specifying the new parameter objective in gds.beta.spanningTree.

New features

Minimum Directed Steiner Tree

Added heuristic for minimum directed Steiner Tree under the gds.alpha.steinerTree domain.
- Added stats mode with gds.alpha.steinerTree.stats
- Added stream mode with gds.alpha.steinerTree.stream
- Added mutate mode with gds.alpha.steinerTree.mutate
- Added write mode with gds.alpha.steinerTree.write

Leiden

New parameter consecutiveIds that assigns consecutive ids for the discovered communities.
New parameter seedProperty to seed initial communities for nodes.
New parameter tolerance to enable convergence criteria based on difference in modularity from one iteration to another.
Now available in progress tracking - gds.list.progress()
Added memory estimation mode:
- gds.beta.leiden.mutate.estimate
- gds.beta.leiden.stats.estimate
- gds.beta.leiden.stream.estimate
- gds.beta.leiden.write.estimate

Logistic Regression & MLP

New configuration parameters classWeights and focusWeight for training methods, supported by procedures:
- gds.beta.pipeline.nodeClassification.addLogisticRegression
- gds.beta.pipeline.nodeClassification.addMLP
- gds.beta.pipeline.linkPrediction.addLogisticRegression
- gds.beta.pipeline.linkPrediction.addMLP

HashGNN

New algorithm gds.alpha.hashgnn.{mutate,stream} to create HashGNN node embeddings
New procedures gds.alpha.hashgnn.{mutate,stream}.estimate to estimate the memory required to run HashGNN

Link Prediction

Added new optional configuration parameter negativeRelationshipType to gds.beta.pipeline.linkPrediction.configureSplit

Spanning Tree

New modes supported: gds.alpha.spanningTree.(stats, stream, mutate)
New yield output for gds.alpha.spanningTree that outputs the sum of weights in the discovered spanning tree.
New yield output for gds.alpha.spanningTree that outputs the number of relationships written or added for write and mutate mode respectively.
Added memory estimation mode:
- gds.alpha.spanningTree.stream.estimate
- gds.alpha.spanningTree.mutate.estimate
- gds.alpha.spanningTree.stats.estimate
- gds.alpha.spanningTree.write.estimate

Write Labels

Added gds.alpha.graph.nodeLabel.write to allow for Node Labels to be written back from projections to a Neo4j Database

Administration

Added the jobId and username to the ongoingGdsProcedures return field of gds.alpha.systemMonitor.
Added username as a new return field to gds.beta.listProgress.
Added a new return field to gds.graph.list called schemaWithOrientation which also includes the orientation.

Bug fixes

Fixed a bug in Minimum Weighted Spanning Tree on graphs with parallel edges where the discovered tree could have wrong weights.

Improvements

Arrow

graph import now fully supports external node ids in the 64 Bit space.
graph import now supports 16, 32 or 64 Bit node identifiers.

Leiden

Better parallelization and improved overall performance

Other Algorithms

Speed improvements for Dijkstra, Astar, Yens, CELF, weighted Betweenness Centrality, and the Spanning Tree algorithms. The improvements will see a slight increase in the memory consumption of these algorithms.

Other changes

Histograms returned such as degreeDistribution in gds.graph.list can have slightly different values for specific percentiles due to changes in floating point operations.
Progress tracking in the Spanning Tree algorithm has been reworked. Progress reporting may differ from earlier versions.
Mark the yielded field schema as deprecated in gds.graph.list and gds.graph.drop. In the next major release, the schema field will use the semantics of schemaWithOrientation


29 Nov 10:39

laeg

2.2.5

dcb851c

Graph Data Science 2.2.5

Neo4j Graph Data Science 2.2.5 is compatible with Neo4j Database 5 versions & 4.4 versions ≥ 4.4.9 & 4.3 versions ≥ 4.3.15.

For Neo4j Graph Data Science compatibility, please use the Neo4j Compatibility Matrix.

Bug fixes

The following functions would not work as expected with Neo4j 5.x versions:
- gds.alpha.linkprediction.adamicAdar
- gds.alpha.linkprediction.commonNeighbors
- gds.alpha.linkprediction.resourceAllocation
- gds.alpha.linkprediction.totalNeighbors

