Commit edd6b16

fix(gfql): validate chain construction (#861)
* fix(gfql): validate chain construction
* docs(gfql): clarify validation defaults
* test(gfql): cover nested/from_json validation
* chore(mypy): let runner set python version; ignore pandas where(None) arg
* chore(mypy): silence pandas where(None) overload
* chore(kusto): type-safe null normalization
1 parent eef553f commit edd6b16

14 files changed, +162 -101 lines changed

.gitignore

Lines changed: 1 addition & 0 deletions
@@ -74,6 +74,7 @@ demos/data/BIOGRID-IDENTIFIERS-3.3.123.tab.txt
 
 # Ignore local Python virtualenv folders (used by Jenkins when running tests, etc.)
 /.pyenv*/
+.venv/
 
 # Ignore IDE stuff
 .idea

CHANGELOG.md

Lines changed: 6 additions & 0 deletions
@@ -8,6 +8,12 @@ This project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.htm
 ## [Development]
 <!-- Do Not Erase This Section - Used for tracking unreleased changes -->
 
+### Fixed
+- **GFQL:** `Chain` now validates on construction (matching docs) and rejects invalid hops immediately; pass `validate=False` to defer validation when assembling advanced flows (fixes #860).
+
+### Docs
+- **GFQL validation:** Clarified `Chain` constructor validation defaults, `validate=False` defer option, validation phases, and guidance for large/nested ASTs to reduce redundant validation (issue #860).
+
 ## [0.46.0 - 2025-12-01]
 
 ### Added

demos/demos_databases_apis/memgraph/visualizing_iam_dataset.ipynb

Lines changed: 1 addition & 1 deletion
@@ -289,7 +289,7 @@
 "cell_type": "markdown",
 "metadata": {},
 "source": [
-"![Screenshot](https://github.com/karmenrabar/pygraphistry_images/blob/main/memgraphlab.png?raw=true)"
+"![Screenshot](https://raw.githubusercontent.com/karmenrabar/pygraphistry_images/refs/heads/main/memgraphlab.png)"
 ]
 },
 {

docs/source/gfql/spec/llm_guide.md

Lines changed: 1 addition & 1 deletion
@@ -25,7 +25,7 @@
 - [Graph Algorithms](#graph-algorithms) - PageRank, Louvain, UMAP, Hypergraph
 - [Visualization](#visualization) - Colors, icons, sizes
 - [Layouts](#layouts) - FA2 default, ring layouts
-- [Multi-Step (Let/Ref)](#multi-step-letref) - DAG composition
+- [Multi-Step (Let/Ref)](#let-multi-step) - DAG composition
 
 **Domain Guidance:**
 - [Icons & Palettes](#domain-guidance) - By vertical (Cyber, Fraud, Gov, Social, Supply Chain, Events)

docs/source/gfql/spec/python_embedding.md

Lines changed: 24 additions & 2 deletions
@@ -105,7 +105,7 @@ GFQL provides comprehensive validation to catch errors early:
 
 ### Syntax Validation
 
-Operations are automatically validated during construction:
+Chains validate on construction by default. Nodes, edges, predicates, refs, calls, and remote graphs are validated when a parent `Chain`/`Let` validates them or when you call `.validate()` directly. Schema validation is a separate, data-aware pass.
 
 ```python
 from graphistry.compute.chain import Chain
@@ -118,6 +118,28 @@ chain = Chain([
 ])
 ```
 
+For advanced flows (large/nested ASTs or staged assembly), you can defer structural validation and run it once after assembly:
+
+```python
+# Defer validation while building
+chain = Chain([
+    n({'type': 'person'}),
+    e_forward({'hops': -1})
+], validate=False)  # No validation yet
+
+# Later, validate once (or let g.gfql validate it)
+chain.validate()  # Raises GFQLTypeError: hops must be positive
+```
+
+Use deferred validation to avoid re-validating nested `Chain`/`Let` wrappers during assembly; keep the defaults for typical workflows so mistakes surface immediately.
+
+### Validation Phases
+
+- **Constructor defaults:** `Chain([...])` and `Let(...)` validate immediately; pass `validate=False` to defer.
+- **Parent-driven checks:** AST operations (`Node`, `Edge`, predicates, `Ref`, `Call`, `RemoteGraph`) validate when their parent validates, or via explicit `.validate()`.
+- **JSON defaults:** `to_json` / `from_json` default to `validate=True`, which runs structural validation during serialization/deserialization.
+- **Schema validation:** Use `validate_chain_schema(g, chain)` or `g.gfql(..., validate_schema=True)` to verify column/type compatibility before execution.
+
 ### Schema Validation
 
 You have two options for validating queries against your data schema:
@@ -409,4 +431,4 @@ result = g.gfql(let({
 ## See Also
 
 - {ref}`gfql-spec-language` - Core language specification
-- [GFQL Quick Reference](../quick.rst) - Python API examples
+- [GFQL Quick Reference](../quick.rst) - Python API examples

docs/source/graphistry.compute.gfql_validation.rst

Lines changed: 3 additions & 20 deletions
@@ -1,27 +1,10 @@
 graphistry.compute.gfql\_validation package
 ===========================================
 
-Submodules
-----------
+.. note::
 
-graphistry.compute.gfql\_validation.exceptions module
------------------------------------------------------
-
-.. automodule:: graphistry.compute.gfql_validation.exceptions
-   :members:
-   :undoc-members:
-   :show-inheritance:
-
-graphistry.compute.gfql\_validation.validate module
----------------------------------------------------
-
-.. automodule:: graphistry.compute.gfql_validation.validate
-   :members:
-   :undoc-members:
-   :show-inheritance:
-
-Module contents
----------------
+   Deprecated. Use :mod:`graphistry.compute.gfql` instead. This module only re-exports
+   the GFQL validation APIs for backward compatibility.
 
 .. automodule:: graphistry.compute.gfql_validation
    :members:

docs/source/graphistry.compute.rst

Lines changed: 0 additions & 8 deletions
@@ -61,14 +61,6 @@ graphistry.compute.chain\_remote module
    :undoc-members:
    :show-inheritance:
 
-graphistry.compute.chain\_validate module
------------------------------------------
-
-.. automodule:: graphistry.compute.chain_validate
-   :members:
-   :undoc-members:
-   :show-inheritance:
-
 graphistry.compute.cluster module
 ---------------------------------
 

graphistry/compute/ast.py

Lines changed: 18 additions & 14 deletions
@@ -662,22 +662,26 @@ def from_json(cls, d: dict, validate: bool = True) -> 'ASTEdge':
 
 class ASTLet(ASTObject):
     """Let-bindings for named graph operations in a DAG.
-
-    Allows defining reusable graph operations that can reference each other,
+
+    Lets you define reusable graph operations that can reference each other,
     forming a directed acyclic graph (DAG) of computations.
-
-    :param bindings: Dictionary mapping names to graph operations
-    :type bindings: Dict[str, Union[ASTObject, Chain, Plottable]]
-
-    :raises GFQLTypeError: If bindings is not a dict or contains invalid keys/values
-
-    **Example::**
 
-        # Matchers now supported directly (operate on root graph)
-        dag = ASTLet({
-            'persons': n({'type': 'person'}),
-            'friends': ASTRef('persons', [e_forward({'rel': 'friend'})])
-        })
+    Parameters
+    ----------
+    bindings : Dict[str, Union[ASTObject, Chain, Plottable]]
+        Mapping from binding names to graph operations (AST objects or Plottables).
+
+    Raises
+    ------
+    GFQLTypeError
+        If ``bindings`` is not a dict or contains invalid keys/values.
+
+    Example
+    -------
+    >>> dag = ASTLet({
+    ...     "persons": n({"type": "person"}),
+    ...     "friends": ASTRef("persons", [e_forward({"rel": "friend"})])
+    ... })
     """
     bindings: Dict[str, Union['ASTObject', 'Chain', Plottable]]
 

graphistry/compute/chain.py

Lines changed: 5 additions & 4 deletions
@@ -25,8 +25,11 @@
 
 class Chain(ASTSerializable):
 
-    def __init__(self, chain: List[ASTObject]) -> None:
+    def __init__(self, chain: List[ASTObject], validate: bool = True) -> None:
         self.chain = chain
+        if validate:
+            # Fail fast on invalid chains; matches documented automatic validation behavior
+            self.validate(collect_all=False)
 
     def validate(self, collect_all: bool = False) -> Optional[List['GFQLValidationError']]:
         """Override to collect all chain validation errors."""
@@ -116,9 +119,7 @@ def from_json(cls, d: Dict[str, JSONVal], validate: bool = True) -> 'Chain':
                 f"Chain field must be a list, got {type(d['chain']).__name__}"
             )
 
-        out = cls([ASTObject_from_json(op, validate=validate) for op in d['chain']])
-        if validate:
-            out.validate()
+        out = cls([ASTObject_from_json(op, validate=validate) for op in d['chain']], validate=validate)
         return out
 
     def to_json(self, validate=True) -> Dict[str, JSONVal]:
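
The `chain.py` change follows a fail-fast pattern: validate in `__init__` unless the caller opts out, and have `from_json` forward its `validate` flag to the constructor rather than validating a second time. The sketch below illustrates that pattern only. `SketchChain`, `SketchHop`, and `SketchValidationError` are hypothetical stand-ins invented for this example; the code does not import graphistry and should not be read as the library's implementation.

```python
from typing import Any, Dict, List


class SketchValidationError(ValueError):
    """Hypothetical stand-in for a GFQL validation error."""


class SketchHop:
    """Hypothetical stand-in for an edge operation with a hop count."""

    def __init__(self, hops: int) -> None:
        self.hops = hops

    def validate(self) -> None:
        # Reject non-integer or non-positive hop counts.
        if not isinstance(self.hops, int) or self.hops < 1:
            raise SketchValidationError(f"hops must be positive, got {self.hops!r}")


class SketchChain:
    """Validates on construction by default; validate=False defers the check."""

    def __init__(self, chain: List[SketchHop], validate: bool = True) -> None:
        self.chain = chain
        if validate:
            self.validate()

    def validate(self) -> None:
        if not isinstance(self.chain, list):
            raise SketchValidationError("chain must be a list")
        for op in self.chain:
            op.validate()

    @classmethod
    def from_json(cls, d: Dict[str, Any], validate: bool = True) -> "SketchChain":
        # Forward the flag so the constructor performs the single validation pass.
        return cls([SketchHop(op["hops"]) for op in d["chain"]], validate=validate)


# Default: an invalid hop is rejected at construction time.
try:
    SketchChain([SketchHop(-1)])
    rejected_on_construction = False
except SketchValidationError:
    rejected_on_construction = True

# Deferred: construction succeeds, and the error surfaces on explicit validate().
deferred = SketchChain([SketchHop(-1)], validate=False)
try:
    deferred.validate()
    rejected_on_validate = False
except SketchValidationError:
    rejected_on_validate = True
```

Forwarding `validate` through `from_json` keeps one code path responsible for validation, so a deserialized chain is checked exactly once when `validate=True` and not at all when `validate=False`.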

graphistry/compute/gfql_validation/__init__.py

Lines changed: 6 additions & 7 deletions
@@ -1,13 +1,12 @@
 """
-DEPRECATED: This module is deprecated and will be removed in a future version.
+DEPRECATED: Use ``graphistry.compute.gfql`` instead.
 
-All functionality has been moved to graphistry.compute.gfql.
-Please update your imports:
-FROM: graphistry.compute.gfql_validation
-TO: graphistry.compute.gfql
+All functionality moved to ``graphistry.compute.gfql``. Update imports:
 
-This duplicate module was created accidentally during code extraction and
-provides no additional functionality.
+- From: ``graphistry.compute.gfql_validation``
+- To: ``graphistry.compute.gfql``
+
+This duplicate module was created during code extraction and provides no additional functionality.
 """
 
 import warnings
