Releases: neo4j/graph-data-science
Graph Data Science 2.2.4
GDS 2.2.4 is compatible with Neo4j 5 versions & 4.4 versions ≥ 4.4.9 & 4.3 versions ≥ 4.3.15.
For GDS compatibility with previous releases, please use GDS Compatibility Table.
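The compatibility statement above can be read as a simple version rule. The following is a minimal sketch of that rule for GDS 2.2.4 only; the function names are hypothetical helpers and are not part of GDS or Neo4j.

```python
def parse_version(text):
    """Turn a dotted version such as '4.4.9' into the tuple (4, 4, 9)."""
    return tuple(int(part) for part in text.split("."))


def is_compatible_with_gds_2_2_4(neo4j_version):
    """Apply the rule stated in the 2.2.4 notes (hypothetical helper).

    Neo4j 5 versions are accepted, 4.4 needs at least 4.4.9,
    and 4.3 needs at least 4.3.15. Everything else is rejected.
    """
    version = parse_version(neo4j_version)
    if version[0] == 5:
        return True
    if version[:2] == (4, 4):
        return version >= (4, 4, 9)
    if version[:2] == (4, 3):
        return version >= (4, 3, 15)
    return False
```

For other GDS releases the thresholds differ, so the GDS Compatibility Table remains the authoritative source.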
Bug fixes
- `gds.alpha.nodeSimilarity.filtered` would give incorrect node IDs.
- Pregel framework: the computation would not stop after terminating the underlying transaction. This affects `gds.pageRank`, `gds.articleRank`, and `gds.eigenvector`.
- `gds.alpha.hits` and `gds.alpha.sllpa` could not be used as a nodeProperty step inside ML pipelines, including `gds.beta.pipeline.linkPrediction`, `gds.beta.pipeline.nodeClassification`, and `gds.alpha.pipeline.nodeRegression`.
- nodeProperty steps could not be added to ML pipelines when running against Neo4j 5.x. This affected `gds.beta.pipeline.linkPrediction`, `gds.beta.pipeline.nodeClassification`, and `gds.alpha.pipeline.nodeRegression`.
Improvements
- `gds.graph.list` will only calculate the graph size when the procedure is called without any `YIELD` or if the fields `memoryUsage` or `sizeInBytes` are explicitly `YIELD`-ed. Using `YIELD` to return other fields but not one of `memoryUsage` or `sizeInBytes` can speed up the execution time of `gds.graph.list`.
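The rule for when `gds.graph.list` computes the graph size can be sketched as a small predicate. This is an illustration of the behaviour described in the note, not GDS code: both function names are hypothetical, and the fields `graphName` and `nodeCount` used in the usage note are assumed to be available from `gds.graph.list`.

```python
# Fields that, per the release note, force the graph size to be computed.
SIZE_FIELDS = {"memoryUsage", "sizeInBytes"}


def triggers_size_calculation(yield_fields=None):
    """Return True if a call would compute the graph size (hypothetical helper).

    No YIELD at all means every field is returned, so the size is computed.
    Otherwise the size is only computed when a size field is yielded.
    """
    if not yield_fields:
        return True
    return any(field in SIZE_FIELDS for field in yield_fields)


def build_graph_list_query(yield_fields=None):
    """Build the Cypher text for a gds.graph.list call (hypothetical helper)."""
    query = "CALL gds.graph.list()"
    if yield_fields:
        fields = ", ".join(yield_fields)
        query += " YIELD " + fields + " RETURN " + fields
    return query
```

Calling `build_graph_list_query(["graphName", "nodeCount"])` produces a query that, under the rule above, skips the size calculation, while adding `memoryUsage` to the list brings it back.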
Graph Data Science 2.2.3
GDS 2.2.3 is compatible with Neo4j 5 versions & 4.4 versions ≥ 4.4.9 & 4.3 versions ≥ 4.3.15.
For GDS compatibility with previous releases, please use GDS Compatibility Table.
Bug fixes
- `gds.graph.export` failed to run on Neo4j 5.X.
- `gds.graph.export` failed with InvalidRecordException when `writeConcurrency` is set > 1.
- Enterprise users were unable to load models trained with concurrency > 4.
Improvements
- Arrow graph import now fully supports external node ids in the 64 Bit space.
Graph Data Science 2.3.0-alpha01
GDS 2.2.2 is compatible with Neo4j 5 versions & 4.4 versions ≥ 4.4.9 & 4.3 versions ≥ 4.3.15.
For GDS compatibility with previous releases, please use GDS Compatibility Table.
New features
- Added a parameter `consecutiveIds` to Leiden to assign consecutive ids for the discovered communities.
- Added a parameter `seedProperty` to Leiden to seed initial communities for nodes.
- Added new configuration parameter `focusWeight` for Logistic Regression training method, supported by procedures:
  - `gds.beta.pipeline.nodeClassification.addLogisticRegression`
  - `gds.beta.pipeline.linkPrediction.addLogisticRegression`
Bug fixes
- Fixed a bug in Minimum Weighted Spanning Tree on graphs with parallel edges where the discovered tree could have wrong weights.
Improvements
- Arrow graph import now fully supports external node ids in the 64 Bit space.
- Arrow graph import now supports 16, 32 or 64 Bit node identifiers.
Other changes
- Histograms returned such as `degreeDistribution` in `gds.graph.list` can have slightly different values for specific percentiles due to changes in floating point operations.
Graph Data Science 2.2.2
GDS 2.2.2 is compatible with Neo4j 5 versions & 4.4 versions ≥ 4.4.9 & 4.3 versions ≥ 4.3.15.
For GDS compatibility with previous releases, please use GDS Compatibility Table.
Improvements
- Graph Data Science ≥2.2.2 now supports Neo4j 5
Graph Data Science 2.2.1
GDS 2.2.1 is compatible with Neo4j 4.3 versions ≥ 4.3.15 and 4.4 versions ≥ 4.4.9.
For GDS compatibility with previous releases of 4.3 and 4.4, please see GDS 2.1.6. The 2.1 series is also incompatible with Neo4j 3.5.x, 4.0, 4.1, and 4.2. For a 3.5 compatible release, please see GDS 1.1.7. For a 4.0 compatible release, please see GDS 1.6.5. For a 4.1 or 4.2 compatible release, please see GDS 1.8.8.
Breaking changes
- Change the content of some fields from the output of `gds.debug.arrow`:
  - `listenAddress` now always returns the same content as `advertisedListenAddress`
  - `serverLocation` always returns `NULL`
Graph Data Science 2.2.0
GDS 2.2.0 is compatible with Neo4j 4.3 versions ≥ 4.3.15 and 4.4 versions ≥ 4.4.9.
For GDS compatibility with previous releases of 4.3 and 4.4, please see GDS 2.1.6. The 2.1 series is also incompatible with Neo4j 3.5.x, 4.0, 4.1, and 4.2. For a 3.5 compatible release, please see GDS 1.1.7. For a 4.0 compatible release, please see GDS 1.6.5. For a 4.1 or 4.2 compatible release, please see GDS 1.8.8.
Breaking changes
- Link Prediction filtering:
  - Change graph filtering in `gds.beta.pipeline.linkPrediction.train`
    - Replace parameter `nodeLabels` with `sourceNodeLabel` and `targetNodeLabel`.
    - Replace parameter `relationshipTypes` with `targetRelationshipType`.
  - Change graph filtering in `gds.beta.pipeline.linkPrediction.predict`
    - Replace parameter `nodeLabels` with optional `sourceNodeLabels` and `targetNodeLabels`. By default, they will be derived from the model's train configuration.
    - Change the default value for `relationshipTypes` to the `targetRelationshipType` from the model's train configuration.
- Node Classification & Regression filtering:
  - Change graph filtering in `gds.beta.pipeline.nodeClassification.train` and `gds.beta.pipeline.nodeRegression.train`
    - Replace parameter `nodeLabels` with `targetNodeLabels`.
  - Change graph filtering in `gds.beta.pipeline.nodeClassification.predict` and `gds.beta.pipeline.nodeRegression.predict`
    - Replace parameter `nodeLabels` with `targetNodeLabels`. By default, they will be derived from the model's train configuration.
- Promoting Collapse Path to beta tier
  - Changed the procedure name to `gds.beta.collapsePath.mutate`
  - Use parameter `pathTemplates` to now specify multiple _path templates_.
- Promoting CELF to `beta` tier
  - Moved `gds.alpha.influenceMaximization.celf.stream` to `gds.beta.influenceMaximization.celf.stream`
- For graphs created with `gds.graph.project.cypher`, reduce output of `gds.graph.list` to only print the names of `parameters`. This will avoid printing the parameter values, which potentially leads to long procedure execution times.
- RandomWalk algorithm promoted to product tier
  - `gds.beta.randomWalk.stats` => `gds.randomWalk.stats`
  - `gds.beta.randomWalk.stats.estimate` => `gds.randomWalk.stats.estimate`
  - `gds.beta.randomWalk.stream` => `gds.randomWalk.stream`
  - `gds.beta.randomWalk.stream.estimate` => `gds.randomWalk.stream.estimate`
- Removed `debug_log` config field from Arrow Create Database action.
- Node2Vec uses new embedding initializer `NORMALIZED` as default.
- Dropped support for older patches:
  - for 4.3, only 4.3.15 and later is supported
  - for 4.4, only 4.4.9 and later is supported
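Several of the breaking changes above are pure procedure renames, so existing query strings can be updated mechanically. The sketch below covers only the RandomWalk and CELF stream renames listed in these notes; `PROCEDURE_RENAMES` and `migrate_query` are hypothetical names, and a plain text replacement like this is an assumption that may not suit every code base.

```python
# Old name -> new name, taken from the 2.2.0 breaking changes.
# The ".estimate" variants share these prefixes, so they are covered too.
PROCEDURE_RENAMES = {
    "gds.beta.randomWalk.stats": "gds.randomWalk.stats",
    "gds.beta.randomWalk.stream": "gds.randomWalk.stream",
    "gds.alpha.influenceMaximization.celf.stream":
        "gds.beta.influenceMaximization.celf.stream",
}


def migrate_query(query):
    """Rewrite renamed procedure names inside a Cypher query string."""
    for old_name, new_name in PROCEDURE_RENAMES.items():
        query = query.replace(old_name, new_name)
    return query
```

The parameter changes for the pipeline procedures are not handled here, because replacing `nodeLabels` with source and target labels requires a decision that cannot be derived from the old configuration alone.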
New features
- Link Prediction filtering:
  - Supports heterogeneous LinkPrediction pipelines by allowing configuring which node labels and relationship type to train and predict for.
  - See Breaking changes above for more details.
- K-means:
  - Added centroids and average node-centroid distance to result for Mutate, Stats, and Write modes.
  - Added distance to centroid per node result in Stream mode.
  - Introduced a parameter `numberOfRestarts` that runs K-Means multiple times and picks the one with the minimum node-centroid distance.
  - Introduced a parameter `computeSilhouette` that if enabled will compute silhouette related metrics.
  - Introduced a parameter `initialSampler` which can select different sampling strategies for picking the first centroids.
    - Added the `K-means++` initialization algorithm which can be enabled by setting `initialSampler=kmeans++`.
  - Introduced the parameter `seedCentroids` which seeds input centroids to k-means (in negation of the above).
- Introduced a new scaler `Center` for `ScaleProperties` that subtracts the mean from each value.
- Expose `penaltyL2` to configure the l2 regularization term to the loss function in `gds.beta.graphSage.train`.
- Add Multilayer Perceptron as a training method for node classification (`gds.alpha.pipeline.nodeClassification.addMLP`) and link prediction (`gds.alpha.pipeline.linkPrediction.addMLP`).
- Add `SAME_CATEGORY` feature type to `gds.beta.pipeline.linkPrediction.addFeature`.
- Added new procedure `gds.beta.graph.relationships.stream` that streams relationship topology.
- Added arrow export endpoint `gds.beta.graph.relationships.stream` that streams relationship topology.
- Added new procedure `gds.alpha.graph.sample.rwr` that creates a new graph projection by sampling using random walk with restarts.
- Added the ability to collapse multiple paths using `gds.beta.collapsePath.mutate`.
- Promoting CELF algorithm to `beta` tier.
  - Added `gds.beta.influenceMaximization.celf.stats`
  - Added `gds.beta.influenceMaximization.celf.mutate`
  - Added `gds.beta.influenceMaximization.celf.write`
  - Added progress tracking capabilities.
  - Added memory estimation.
- Progress tracking for KMeans algorithm.
- Memory estimation for KMeans.
  - Added `gds.alpha.kmeans.mutate.estimate`
  - Added `gds.alpha.kmeans.stats.estimate`
  - Added `gds.alpha.kmeans.stream.estimate`
  - Added `gds.alpha.kmeans.write.estimate`
- Added procedure to compute modularity for pre-computed communities.
  - `gds.alpha.modularity.stats`
  - `gds.alpha.modularity.stream`
- Added new config options to the GDS Flight server.
  - `gds.arrow.encryption.never` deactivates the server encryption even if it would otherwise be enabled.
  - `gds.arrow.advertised_listen_address` sets the server location that clients should connect to.
- Added support for importing `String` node identifiers for the Arrow `CREATE_DATABASE` action.
- Added capability to run BetweennessCentrality using relationship weights.
  - Added `relationshipWeightProperty` optional configuration parameter.
- Added `stats` mode procedures for RandomWalk.
  - `gds.beta.randomWalk.stats`
  - `gds.beta.randomWalk.stats.estimate`
- Introduced the ability to configure defaults and limits for configuration parameters.
  - `gds.alpha.config.defaults.list`
  - `gds.alpha.config.defaults.set`
  - `gds.alpha.config.limits.list`
  - `gds.alpha.config.limits.set`
- Introduce new configuration parameters `contextNodeLabels` and `contextRelationshipTypes` in nodePropertySteps.
  - `gds.beta.pipeline.linkPrediction.addNodeProperty`
  - `gds.beta.pipeline.nodeClassification.addNodeProperty`
  - `gds.alpha.pipeline.nodeRegression.addNodeProperty`
  - The context is used to enlarge the input graph to the node property steps when running `gds.beta.pipeline.linkPrediction.[train|predict]`, `gds.beta.pipeline.nodeClassification.[train|predict]` and `gds.alpha.pipeline.nodeRegression.[train|predict]`.
- Leiden
  - Add capability to mutate `intermediateCommunities` when `includeIntermediateCommunities` is set to `true`.
  - Add capability to write `intermediateCommunities` when `includeIntermediateCommunities` is set to `true`.
- Node2Vec adds new embedding initializer `NORMALIZED` configured with the parameter `embeddingInitializer`.
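The `Center` scaler listed above is described as subtracting the mean from each value. A minimal sketch of that arithmetic on a plain list of numbers follows; the `center` function is a hypothetical illustration and not the GDS implementation.

```python
def center(values):
    """Subtract the arithmetic mean from every value in the list."""
    if not values:
        raise ValueError("center() needs at least one value")
    mean = sum(values) / len(values)
    return [value - mean for value in values]
```

For example, `center([1.0, 2.0, 3.0])` has mean 2.0 and returns `[-1.0, 0.0, 1.0]`, so the centered values sum to zero.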
Bug fixes
- Fixed a bug where eager checking for business rules around GDS on a Neo4j cluster would cause the cluster to fail to start.
- Fixed a bug where Neo4j users with `admin` role could not see all graphs in the catalog on GDS enterprise.
- Fixed a bug in random graph generation where the resulting graph can end up with an incorrect relationship schema.
- Fixed a bug where a schema filter would not create a deep copy of the property schema map.
- Fixed a bug where modularity could have been incorrectly updated in ModularityOptimization. This may affect the number of performed iterations for ModularityOptimization or number of levels for Louvain.
- Fixed a bug where restoring from csv could not read values wrapped in quotes.
- Fixed a bug where KNN did not use the expected search space. This will improve the result but also increase the runtime.
- Fixed a bug in ML autotuning where `maxTrials` included model evaluations with concrete configs.
- Fixed a bug where `gds.triangleCount` and `gds.localClusteringCoefficient` were allowed to run on directed graphs.
- Fixed a bug in `gds.graph.export` and Arrow DB import where the `writeConcurrency` was not respected.
- Fixed a bug with Node Operations where `gds.graph.nodeProperties.write`, `gds.graph.nodeProperties.drop` and `gds.graph.nodeProperties/y.stream` would not accept `String` input for parameters `nodeLabels` and/or `nodeProperties`.
- Fixed a bug where Node2Vec would report negative losses.
- Fixed a bug with `gds.graph.nodeProperties/y.stream`, where the wrong nodes were returned when specifying a `nodeLabels` filter and using Arrow.
- Fixed a bug in the Louvain algorithm, where aggregating dense communities could potentially lead to an exception.
- Fixed a bug where model loading was attempted even for unlicensed users, which might fail database startup.
Improvements
- Better error handling in K-means
- Improve memory estimation for `gds.beta.pipeline.linkPrediction.train` when the nodePropertySteps used a weighted graph.
- Improve runtime of feature generation in `gds.beta.linkPrediction.[train|predict]`.
- Improve performance of `gds.graph.project.cypher` by using the subscriber API.
- Improve convergence criteria for `LogisticRegression` and `LinearRegression` trainers, by making it independent of the number of batches. This affects `gds.alpha.pipeline.nodeRegression.train`, `gds.beta.pipeline.[linkPrediction|nodeClassification].train`.
- Improve error handling on invalid user input.
- Cypher on GDS projections is now capable of setting labels on nodes.
- Promoting CELF algorithm to `bet...
Graph Data Science 2.1.13
GDS 2.1.13 is compatible with Neo4j 4.3 versions ≥ 4.3.15 and 4.4 versions ≥ 4.4.9.
For GDS compatibility with previous releases of 4.3 and 4.4, please see GDS 2.1.6. The 2.1 series is also incompatible with Neo4j 3.5.x, 4.0, 4.1, and 4.2. For a 3.5 compatible release, please see GDS 1.1.7. For a 4.0 compatible release, please see GDS 1.6.5. For a 4.1 or 4.2 compatible release, please see GDS 1.8.8.
Bug Fixes
- `gds.graph.nodeProperties.write`, `gds.graph.nodeProperties.drop`, `gds.graph.nodeProperty.stream` and `gds.graph.nodeProperties.stream` now accept `String` input for parameters `nodeLabels` and/or `nodeProperties`.
- `gds.graph.nodeProperty.stream` and `gds.graph.nodeProperties.stream` would return the wrong nodes when specifying a `nodeLabels` filter when using Arrow.
- Louvain algorithm would throw an exception when aggregating dense communities.
Improvements
- Export to CSV now enabled when GDS is running on a Causal Cluster Read Replica
  - `gds.beta.graph.export.csv`
  - `gds.beta.graph.export.csv.estimate`
Graph Data Science 2.1.12
GDS 2.1.12 is compatible with Neo4j 4.3 versions ≥ 4.3.15 and 4.4 versions ≥ 4.4.9.
For GDS compatibility with previous releases of 4.3 and 4.4, please see GDS 2.1.6. The 2.1 series is also incompatible with Neo4j 3.5.x, 4.0, 4.1, and 4.2. For a 3.5 compatible release, please see GDS 1.1.7. For a 4.0 compatible release, please see GDS 1.6.5. For a 4.1 or 4.2 compatible release, please see GDS 1.8.8.
Improvements
- New procedures for enabling and disabling Arrow database import (default: enabled)
  - `gds.features.enableArrowDatabaseImport`
  - `gds.features.enableArrowDatabaseImport.reset`
Graph Data Science 2.1.11
GDS 2.1.11 is compatible with Neo4j 4.3 versions ≥ 4.3.15 and 4.4 versions ≥ 4.4.9.
For GDS compatibility with previous releases of 4.3 and 4.4, please see GDS 2.1.6. The 2.1 series is also incompatible with Neo4j 3.5.x, 4.0, 4.1, and 4.2. For a 3.5 compatible release, please see GDS 1.1.7. For a 4.0 compatible release, please see GDS 1.6.5. For a 4.1 or 4.2 compatible release, please see GDS 1.8.8.
Bug Fixes
- Fixed a bug in `gds.graph.export` and Arrow DB import where the `writeConcurrency` was not respected.
Graph Data Science 2.1.10
GDS 2.1.10 is compatible with Neo4j 4.3 versions ≥ 4.3.15 and 4.4 versions ≥ 4.4.9.
For GDS compatibility with previous releases of 4.3 and 4.4, please see GDS 2.1.6. The 2.1 series is also incompatible with Neo4j 3.5.x, 4.0, 4.1, and 4.2. For a 3.5 compatible release, please see GDS 1.1.7. For a 4.0 compatible release, please see GDS 1.6.5. For a 4.1 or 4.2 compatible release, please see GDS 1.8.8.
Bug Fixes
- Modularity Optimization and Louvain may run into an ArrayIndexOutOfBoundsException. This bug was introduced in 2.1.8.
- `gds.alpha.create.cypherdb` would not clean up internal state when encountering an unexpected error.
- `gds.alpha.backup`, `gds.alpha.restore` and `gds.beta.project.subgraph` would lose information on relationship projections. This made some algorithms unable to run on graphs produced through the above procedures.