Conversation

@matbesancon
Contributor

No description provided.

@matbesancon
Contributor Author

I can't run the GPU tests locally but wanted to see if this fixed the CI

@matbesancon
Contributor Author

ping @devmotion if you can allow CI :)

@matbesancon
Contributor Author

I was about to fix the tolerance in the simple GPU quadratic regularization, but this might actually be an algorithmic issue rather than a numerical one: both the CPU and GPU versions issue the warning that the semi-smooth Newton method didn't converge.

This also occurs if I increase `maxiter` to 1000, which is already far too high for a Newton-type method.
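For context on why 1000 iterations is "too high": near the solution, a Newton-type method converges quadratically, so the error roughly squares at each step and convergence takes a handful of iterations, not hundreds. A minimal self-contained sketch (plain scalar Newton, not the package's semi-smooth variant):

```julia
# Plain Newton's method on f(x) = x^2 - 2, illustrating quadratic convergence.
# Returns the approximate root and the number of iterations used.
function newton_sqrt2(; maxiter=50, tol=1e-12)
    f(x) = x^2 - 2
    f′(x) = 2x
    x = 1.0
    for k in 1:maxiter
        abs(f(x)) ≤ tol && return x, k - 1  # converged
        x -= f(x) / f′(x)                   # Newton update
    end
    return x, maxiter                        # hit the iteration cap
end

x, iters = newton_sqrt2()
```

Plain Newton on `x² − 2 = 0` reaches near machine precision in about five steps; a solver still making no progress at iteration 500 is stuck, not slow.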

@matbesancon
Contributor Author

TIL sources in Project.toml 😮
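For readers who, like the author here, had not seen this feature: recent Julia/Pkg versions (1.11+, if I recall correctly) support a `[sources]` section in `Project.toml` for pinning where unregistered or development dependencies come from. A hypothetical sketch (the package names, URL, and path below are placeholders):

```toml
[sources]
SomePkg = {url = "https://github.com/SomeOrg/SomePkg.jl", rev = "main"}
LocalPkg = {path = "../LocalPkg"}
```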

@matbesancon
Contributor Author

The semi-smooth Newton method is stalling at this value:

```text
┌ Debug: Semi-smooth Newton algorithm (486/500: absolute error of source marginal = 0.0075981272
└ @ OptimalTransport ~/.julia/dev/OptimalTransport/src/quadratic_newton.jl:200
┌ Debug: Semi-smooth Newton algorithm (487/500: absolute error of source marginal = 0.0075981272
└ @ OptimalTransport ~/.julia/dev/OptimalTransport/src/quadratic_newton.jl:200
⋮  (iterations 488–499: identical output)
┌ Debug: Semi-smooth Newton algorithm (500/500: absolute error of source marginal = 0.0075981272
└ @ OptimalTransport ~/.julia/dev/OptimalTransport/src/quadratic_newton.jl:200
┌ Warning: Semi-smooth Newton algorithm (500/500): not converged
```

@matbesancon
Contributor Author

alright, slowly getting there

```julia
G[(M + 1):end, 1:M] .= σ'
# G[diagind(G)] .+= δ # regularise cg
G += δ * I
view(G, diagind(G)) .+= δ # regularise cg
```
Member

@matbesancon this shouldn't matter, but here `G` wasn't updated in-place.

Contributor Author

So `G` wasn't regularized at all on the current master then?

Member

It was; it's only used in this function, so it doesn't matter whether it's updated in-place or not. It just unintentionally caused an additional allocation in every iteration, which one would like to get rid of completely by caching `G` upfront.
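The distinction can be sketched in a few lines, assuming a toy matrix `G` and regularization strength `δ` (the names only mirror the snippet above; nothing here is taken from the package):

```julia
using LinearAlgebra

δ = 1e-3
G = zeros(3, 3)  # stand-in for a pre-allocated/cached matrix

# Out-of-place: `G + δ * I` allocates a fresh matrix and rebinds G,
# so a cached buffer would be replaced on every iteration.
H = G + δ * I

# In-place: writes δ directly onto the diagonal of the existing G,
# with no intermediate allocation.
view(G, diagind(G)) .+= δ

G == H  # both yield the same regularised matrix
```

Both forms produce identical results; the in-place one just lets a pre-allocated `G` be reused across iterations.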

@matbesancon
Contributor Author

still no convergence of the semi-smooth Newton in the GPU example but well, might just be inherent to the method? 🤷

@devmotion
Member

> still no convergence of the semi-smooth Newton in the GPU example but well, might just be inherent to the method? 🤷

Yeah, I'm not completely sure. My impression is that the tests are quite bad - other methods also show convergence warnings and even POT fails/warns on some test examples.

@codecov

codecov bot commented Aug 6, 2025

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 94.28%. Comparing base (9da044c) to head (3394f67).
⚠️ Report is 1 commit behind head on master.

Additional details and impacted files
```diff
@@            Coverage Diff             @@
##           master     #194      +/-   ##
==========================================
- Coverage   95.30%   94.28%   -1.02%     
==========================================
  Files          14       14              
  Lines         681      683       +2     
==========================================
- Hits          649      644       -5     
- Misses         32       39       +7     
```

☔ View full report in Codecov by Sentry.

@matbesancon
Contributor Author

I'm a bit puzzled that this is still not converging (from the Buildkite logs); I got it to work fine locally with the latest version.

@matbesancon
Contributor Author

But at least CI is green and we can update to the latest packages 🥳

@devmotion
Member

> I'm a bit puzzled that this is still not converging (from buildkite logs), I got it to work fine locally with the latest version

Did you run the GPU test locally (without GPUs)? Or the non-GPU test? The GPU and non-GPU tests are different, e.g., the former uses random histograms whereas the latter uses uniform marginal measures, and they use different regularization parameters.

@matbesancon matbesancon changed the title Update compat - fixed tolerances Update compatibility bounds, fixed tolerances and issues in quadratic regularization Aug 6, 2025
@matbesancon
Contributor Author

> Did you run the GPU test locally (without GPUs)?

I ran the CPU part of the GPU test. Since both warnings are produced, the algorithm doesn't converge regardless of the hardware it runs on.

@devmotion
Member

I just checked out the PR locally. I don't see any convergence issues in the GPU tests: everything works and converges as expected 😕 Only if I decrease the regularization parameter to very small values (e.g. `1f-5`) do I see convergence issues.

@matbesancon
Contributor Author

I would be surprised if Buildkite were caching anything. For now, we could merge this and check later whether this is still an issue?

@devmotion
Member

No, I'm not worried about caches; most likely it's just due to the different hardware architecture (I'm running it on an Apple M2). But it shows that there most likely isn't a general problem with the algorithm and/or the test, but at most a slightly brittle test.

I went through the tests more generally and fixed and improved a few things. I think the regularization parameters were generally unnecessarily small (which caused convergence failures in a few cases, both in OptimalTransport and POT). Additionally, some comparisons with POT were failing only due to a too-high threshold/tolerance in POT. Moreover, the comparisons of the unbalanced Sinkhorn algorithm with POT were actually failing because POT uses a different regularization by default; so instead of adjusting the tolerances, it was sufficient to tell POT to also use entropy regularization.

@matbesancon
Contributor Author

@devmotion I think that's good to go?

@devmotion devmotion merged commit af842b5 into JuliaOptimalTransport:master Aug 6, 2025
16 of 17 checks passed
@matbesancon matbesancon deleted the update_compat branch August 6, 2025 19:58
@matbesancon matbesancon mentioned this pull request Aug 6, 2025