implement preds.py as a Rust wheel with PyO3 & Maturin #3181
This change is a proof of concept for how feasible it would be to take
individual Python modules and re-implement them in Rust, in a "Ship of
Theseus" style Rust rewrite.
This takes some of the simplest parts of the codebase - a list of
predicate free functions - and turns them into Rust checks. These
functions become nearly trivial in Rust, as Rust implements many of them
as methods on char, so we simply delegate to those methods.
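As a rough illustration of the delegation described above, here is a minimal sketch in plain Rust. The function names are hypothetical and may not match what preds.py actually exposes, and the PyO3 wiring (a `#[pyfunction]` attribute on each function plus registration in a `#[pymodule]`) is left out so the snippet compiles without any dependencies.

```rust
// Hypothetical predicate names, chosen for illustration only; the real
// functions in preds.py may be named and scoped differently.
// In the wheel itself, each function would also be exposed through PyO3.

/// True if `c` is an ASCII decimal digit (0-9).
pub fn is_digit(c: char) -> bool {
    c.is_ascii_digit()
}

/// True if `c` is an ASCII hexadecimal digit (0-9, a-f, A-F).
pub fn is_hex_digit(c: char) -> bool {
    c.is_ascii_hexdigit()
}

/// True if `c` is ASCII whitespace (space, tab, LF, FF or CR).
pub fn is_whitespace(c: char) -> bool {
    c.is_ascii_whitespace()
}

/// True if `c` is an ASCII letter (a-z, A-Z).
pub fn is_alpha(c: char) -> bool {
    c.is_ascii_alphabetic()
}
```

Each body is a single call into the standard library, which is why this group of predicates is a low-risk first candidate for porting.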
These functions were chosen not because they're a slow path, but because
they are straightforward functions that return booleans and make for a
good proof of concept. They certainly don't make things slower, however.
The next steps would be to take some of the bigger functions in preds.py
that do full string comparison (such as isXMLishTagname), and port those.
This was avoided for now because these might be slightly more
controversial: we might want to include dependencies for fast string matching.
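For a sense of what a whole-string predicate could look like using only the standard library, here is a hedged sketch. The function `looks_like_tag_name` and its rules are invented for illustration; it does not reproduce the actual logic of isXMLishTagname, and it says nothing about whether a dedicated string-matching dependency would be faster.

```rust
/// Hypothetical whole-string check, NOT the real isXMLishTagname logic:
/// the first character must be an ASCII letter, and every following
/// character must be an ASCII letter, digit, '-', '_', ':' or '.'.
pub fn looks_like_tag_name(s: &str) -> bool {
    let mut chars = s.chars();
    match chars.next() {
        // Check the remaining characters only if the first one is a letter.
        Some(first) if first.is_ascii_alphabetic() => {
            chars.all(|c| c.is_ascii_alphanumeric() || matches!(c, '-' | '_' | ':' | '.'))
        }
        // Empty string, or a first character that is not a letter.
        _ => false,
    }
}
```

A check of this shape stays dependency-free; whether the real functions can be expressed this simply is an open question for the follow-up work.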
I'm not very well versed in Python; this was mostly cobbled together using
https://medium.com/@MatthieuL49/a-mixed-rust-python-project-24491e2af424
and https://colliery.io/blog/rust-python-pattern/ as guides.
As a completely unscientific evaluation of timings, I ran the tests
with and without the Rust bindings:
As expected, both take essentially the same time. This may prove one of
two things:
for simple checks like these).