@jonwzheng
Contributor

This PR seeks to update RMG-website to be compatible with Python 3.9 so that it can leverage the newest version of RMG-Py and other packages.

To do this, several things need to be updated simultaneously (to avoid dependency hell):

  • Either update solprop to the py3.9-compatible version, or turn it into an API service that can be queried separately. Updating solprop is the most straightforward option for now. An API service would be the best solution technically, but it would take the most effort to learn and to implement sustainably. A third option is to call solprop via a subprocess in its own conda or Docker environment, which would also require some rewriting of the backend; the key concern there is whether it's fast enough. In the future we probably want to move toward one of the latter two options, especially if we want to deploy more ML models on the site.
  • Update Django from 2.2 and then see what breaks; then fix those errors in the codebase.
  • See if any new package versions break existing code
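
To make the subprocess option and its speed concern concrete, here is a minimal sketch of how the backend might call a model that lives in a separate conda environment and time the round trip. The environment name, script name, and command-line interface are hypothetical; only `conda run -n` and the standard-library calls are real.

```python
import subprocess
import time


def build_command(env_name, script, smiles):
    """Build a `conda run` command for a hypothetical prediction script.

    `env_name`, `script`, and the single SMILES argument are assumptions
    made for illustration; they are not part of RMG-website or solprop.
    """
    return ["conda", "run", "-n", env_name, "python", script, smiles]


def timed_call(cmd, timeout=120):
    """Run `cmd`, returning (stdout, elapsed_seconds).

    The elapsed time includes interpreter start-up, which is the cost
    to watch when deciding whether this approach is fast enough.
    """
    start = time.perf_counter()
    result = subprocess.run(
        cmd, capture_output=True, text=True, timeout=timeout, check=True
    )
    return result.stdout, time.perf_counter() - start
```

Calling `timed_call(build_command("solprop_env", "predict.py", "CCO"))` a few times would give a rough idea of the per-request overhead before committing to this design.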

There's also an opportunity to make the conda environment leaner (which should make solving for it more straightforward in general).

@jonwzheng
Contributor Author

Roel has very kindly released a binary of solprop that is compatible with this environment, so we can proceed with using that for now.

@jonwzheng
Contributor Author

jonwzheng commented Jul 3, 2025

To-fix:

  • SoluteML not working
  • Template error "'ifequal', expected 'endblock'" in the HTML rendering of thermo and kinetics database entries.
  • get_solvation_data_temp_dep: too many values to unpack (expected 2)
  • Solubility models: Need to refactor to work with SolubilityData. Also the trained logS model is not present, so the binary doesn't work.
  • Test tools
    • NOT NULL constraint failed: rmg_fluxdiagram.java for Generate Flux Diagram. Make migration.
    • Populate Reactions has some error with chemkin
      • somehow fixed after making migrations
  • Test P-dep network
  • Try graphviz again
  • Django3.2
    • Update django.db.models.AutoField wherever it appears. I think fixed in migration?
  • Environment
    • Pin to django=4.2
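
On the 'ifequal' item above: the `{% ifequal %}` and `{% ifnotequal %}` template tags were removed in Django 4.0, so under the django=4.2 pin they are reported as invalid block tags. A minimal sketch of a helper that rewrites them to the `{% if a == b %}` form follows. The function name is hypothetical, and this is only an assumed approach, not necessarily how the templates in this PR were fixed.

```python
import re

# Opening tags: {% ifequal a b %} or {% ifnotequal a b %}
_OPEN = re.compile(r"\{%\s*if(not)?equal\s+(\S+)\s+(\S+)\s*%\}")
# Closing tags: {% endifequal %} or {% endifnotequal %}
_CLOSE = re.compile(r"\{%\s*endif(?:not)?equal\s*%\}")


def modernize_template(source):
    """Rewrite ifequal/ifnotequal blocks to plain `if` blocks.

    Limitation: arguments containing whitespace (for example a quoted
    string with spaces) do not match and are left untouched.
    """
    def _open(match):
        operator = "!=" if match.group(1) else "=="
        return "{% if " + match.group(2) + " " + operator + " " + match.group(3) + " %}"

    return _CLOSE.sub("{% endif %}", _OPEN.sub(_open, source))
```

Any `{% else %}` inside the block is left as it is, since it is valid in both forms.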

@jonwzheng
Contributor Author

Unrelated note: the entryEdit functionality seems kind of suspicious, and maybe we shouldn't allow users to actually use it (I don't think anyone does, and it doesn't even work on the main site). We may want to just remove it.

@jonwzheng
Contributor Author

jonwzheng commented Jul 4, 2025

Some solute functionality (tdep, SoluteML) may already be present in SolProp, in which case we may not need chemprop_solvation.

- update descriptastorus old pin (was causing issues/outdated)
- update solprop mix to new repo
- add cantera pin because of weird glitch?
@jonwzheng
Contributor Author

This is almost ready. The new environment looks like it is working fine now.
We just need to update any code that was broken by these changes, and then it should be all set...

@jonwzheng
Contributor Author

The main remaining issue is that predictions using reference solubility data are not yet available. Here is my correspondence w/ Simona:

In SolubilityModels line 44, g_models is commented out and is therefore inaccessible.
In SolubilityPredictions line 95:

    self.gsolv_ref = (
        self.make_gsolvref_predictions(verbose=verbose)
        if predict_reference_solvents and self.models.g_models is not None
        else None
    )

Because g_models is instantiated as None and cannot be modified, self.gsolv_ref will always be None as well.
Therefore, in SolubilityCalculations line 222:

    self.gsolv_ref_298, self.unc_gsolv_ref_298 = self.extract_predictions(
        predictions.gsolv_ref
    )

    self.logk_ref_298 = self.calculate_logk(gsolv=self.gsolv_ref_298)

Because predictions.gsolv_ref is None, gsolv_ref_298 is also None, and passing it to the logK calculation raises a TypeError.
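
The failure mode can be reproduced in isolation. The sketch below uses a hypothetical stand-in for calculate_logk (the real SolProp implementation, its units, and its signature may differ) to show why a None free energy raises a TypeError, and how a caller could guard against it until g_models is restored.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol K); the unit choice is an assumption
T_REF = 298.15        # reference temperature in K


def calculate_logk(gsolv):
    """Hypothetical stand-in: log10(K) from a solvation free energy."""
    return -gsolv / (math.log(10) * R_KCAL * T_REF)


def safe_logk(gsolv):
    """Return None instead of raising when no reference prediction exists."""
    if gsolv is None:
        return None
    return calculate_logk(gsolv)
```

With this guard, `safe_logk(None)` returns None, while `calculate_logk(None)` raises a TypeError because None does not support arithmetic.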

In the meantime, I think adding a link/header that points to fastsolv is the better option.

Note, reference solubility data is not yet available.
This will require changes to SolProp
Solprop still has some issues so fastsolv might be a good alternative in
the interim. Also we just recommend it because it's a good model.
@jonwzheng jonwzheng marked this pull request as ready for review October 17, 2025 15:53
@jonwzheng
Contributor Author

jonwzheng commented Oct 17, 2025

This PR is now ready for review. If you want to toy around with it, it's active on the dev site: https://rmg.mit.edu:888/

There's a quirk in the installation process. Sometimes a pip version of chemprop_solvation=0.0.2 gets installed somewhere, but that version is broken, so it needs to be uninstalled via pip during the install process. Please feel free to test it on your local machines.
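
A quick way to check whether the broken pip copy is present is sketched below. It relies only on the standard library; the function name is hypothetical, and the distribution name is assumed to match the package name given above.

```python
from importlib import metadata


def has_broken_install(name="chemprop_solvation", bad_version="0.0.2"):
    """Return True if the installed distribution `name` is the broken version.

    Returns False when the package is not installed at all.
    """
    try:
        return metadata.version(name) == bad_version
    except metadata.PackageNotFoundError:
        return False
```

If this returns True inside the website's environment, that copy is the one to remove with pip.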

Please let me know if you have any questions or want to chat about any of this

@BonhyeokKoo

BonhyeokKoo commented Oct 18, 2025

I've confirmed that SoluteML is returning unphysical values in the current development version. This is the same issue described in fhvermei/chemprop_solvation#3, caused by changes in newer descriptastorus versions.

What issues did keeping the older pin of descriptastorus cause?

@jonwzheng
Contributor Author

jonwzheng commented Oct 21, 2025

Old versions of descriptastorus use scipy.stats.gilbrat, which was deprecated in scipy 1.9.0 and removed entirely in 1.11.0 because the name is a misspelling of "gibrat": https://docs.scipy.org/doc/scipy-1.10.1/reference/generated/scipy.stats.gilbrat.html

This renders the website inoperable. So we need to use a more recent version of descriptastorus. Maybe 2.5.1 will work? I'll give it a try...
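
For reference, code that has to run on either side of the rename can look the distribution up under both names. This is only a sketch of the idea, with a hypothetical function name; it is not something descriptastorus itself does.

```python
def resolve_gibrat(stats_module):
    """Return the Gibrat distribution from a scipy.stats-like module.

    Tries the corrected name first, then the misspelled legacy name.
    """
    for name in ("gibrat", "gilbrat"):
        dist = getattr(stats_module, name, None)
        if dist is not None:
            return dist
    raise AttributeError("neither 'gibrat' nor 'gilbrat' is available")
```

In practice this would be called as `resolve_gibrat(scipy.stats)`, but patching a third-party package this way is a stopgap rather than a fix.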

@jonwzheng
Contributor Author

Another option could be to not use chemprop_solvation at all and just use solprop_ml which I think has a re-trained version of SoluteML: https://anaconda.org/simonabuzzi/solprop_ml_mix

This would require some changes to the code though.

We observed non-physical SoluteML values, presumably due to a new version
of descriptastorus.
However, old versions use the old name "gilbrat" instead of "gibrat",
a name that was removed in scipy 1.11.0. At the same time, RMG
requires scipy >=1.9.0 for MILP functionality.
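
Taken together, the two constraints leave only a narrow window of scipy versions, 1.9.0 <= v < 1.11.0, in which old descriptastorus and RMG can coexist. A small sketch of that check is below; the function names are hypothetical, and pre-release suffixes are ignored.

```python
import re


def release_tuple(version):
    """Parse the leading 'major.minor[.patch]' of a version string."""
    match = re.match(r"(\d+)\.(\d+)(?:\.(\d+))?", version)
    if match is None:
        raise ValueError("unrecognized version string: " + repr(version))
    major, minor, patch = match.groups()
    return int(major), int(minor), int(patch or 0)


def in_gilbrat_window(version):
    """True if scipy `version` satisfies both constraints described above."""
    return (1, 9, 0) <= release_tuple(version) < (1, 11, 0)
```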
@jonwzheng
Contributor Author

I tested other versions of descriptastorus, and none of them are compatible:

  • 2.0.6 and earlier have the same scipy problem
  • Later versions have some other packaging issues.

I see we can get descriptastorus=2.5.0 by pinning the scipy version (which isn't a tenable long-term solution, but the env can at least be solved).

Unfortunately, it looks like this still results in non-physical answers. Even descriptastorus=2.2.0 gives non-physical results. So maybe it's something else?

SoluteML package seems busted. We might need to rely on the newly
retrained version instead.
The new soluteML version code can process a batch of inputs, so the code is
rewritten to accommodate this.
Gsolv model was called instead of Hsolv
Development

Successfully merging this pull request may close these issues.

  • Group additivity graphs fail to be drawn
  • Flux diagram not working
  • Update to Django 4.2

3 participants