@VeckoTheGecko (Contributor) commented Oct 31, 2025

(posting now for visibility and to request feedback/pixi debugging help)

Overview

Fixes #10732

This PR migrates the dev workflow and CI for Xarray across to Pixi, providing the following benefits:

  • Composable environments via dependency groups (called "features" in Pixi)
  • Support for multiple environments
  • Task running
  • Lock file support

See the original issue for more info.

Changes so far in this PR:

  • Add pixi badge to readme
  • Migrated most environment files to Pixi config in pixi.toml, split into features that I thought were sensible. I left out environment-benchmarks.yml and binder/environment because they interact with asv and Binder respectively. This PR is already big enough, and I think those should be explored another time.
    • I made the environments in Pixi have similar names as the original conda environments to ease migration
  • Introduced a cache-pixi-lock.yml workflow (see below section "Considerations")
  • Updated ci.yaml
    • Fixed now! 98% there - for some reason the Pixi CI resolves which pytest to .pixi/envs/default/Scripts/pytest, while locally pixi run -e test-all-deps-py313 which pytest resolves to .pixi/envs/test-all-deps-py313/bin/pytest (see test-pixi-dust branch, example action run). Any ideas why, @lucascolley?
  • Update CI additional
  • Update RTD build
  • Migrate minimal environment to Pixi as well
  • Update contributing guidelines (see "Feedback wanted" section below)
  • Migrate hypothesis tests
  • Migrate nightly dev testing

I've tried to make the commits tidy to help with reviewing commit by commit, which might be easier. I also was quite diligent when migrating from the old env files to make sure versions were the same.

Testing instructions

Resources: Pixi Scipy 2025 talk | Docs: Manifest Reference

  • pixi info -> show info about the pixi environments
  • Build documentation: pixi run doc
  • Run tests: pixi run test then choose the environment you want to run the tests in (or pixi run -e environment_name test)
    • Most often you'll want the test-all-deps-py313 environment (corresponding to the old environment ci/requirements/environment.yml)
  • Run pre-commit: pixi run pre-commit
  • Run mypy typing: pixi run typing

Enter an environment (equivalent to conda activate): pixi shell -e env_name
Exit an environment (equivalent to conda deactivate): exit or Ctrl+D

See all tasks: pixi run

Considerations

Lock files o' lock files

There was some interesting conversation in #10732 (comment) about lock files. To summarise:

We have two choices to handle the lock files, either (a) generate them in CI, or (b) commit them to the repo and periodically update them.

(a) generating in CI (done in this PR):

  • add pixi.lock to .gitignore
  • have a workflow which generates the lock file. Cache this under a key that is date + hash(pixi.toml)
  • have all workflows restore this pixi.lock file for environment creation
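
As a rough illustration of the "date + hash(pixi.toml)" cache key described above, here is a minimal Python sketch. The function name, the "pixi-lock" prefix, and the truncated digest length are assumptions made up for this example; the actual cache-pixi-lock.yml workflow may well build its key differently.

```python
import hashlib
from datetime import date


def pixi_lock_cache_key(manifest_text: str, today: date, prefix: str = "pixi-lock") -> str:
    """Build a cache key of the form <prefix>_<date>_<hash of pixi.toml>.

    Hypothetical helper for illustration only, not part of Pixi or this PR.
    """
    # Hash the manifest contents so any edit to pixi.toml yields a new key.
    digest = hashlib.sha256(manifest_text.encode("utf-8")).hexdigest()
    # Include the date so a fresh lock file is generated when the day changes.
    return f"{prefix}_{today.isoformat()}_{digest[:16]}"


manifest = '[workspace]\nname = "example"\n'  # stand-in for the real pixi.toml
key_today = pixi_lock_cache_key(manifest, date(2025, 10, 31))
key_tomorrow = pixi_lock_cache_key(manifest, date(2025, 11, 1))
key_edited = pixi_lock_cache_key(manifest + "# edited\n", date(2025, 10, 31))
print(key_today)
```

With a key like this, every workflow that runs on the same day against the same manifest restores the same cached pixi.lock, while a new day or a manifest change triggers a fresh solve.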

Pros:

  • lock file is only generated once a day and shared across workflows - saving 40s per run
  • close to what was previously done (with daily caches)
  • minimal changes to workflows (only need to add a few lines - cache-pixi-lock.yml is re-usable across different projects).

Cons:

  • Mismatch between a developer's local pixi.lock and the one used in CI. Local developers need to periodically delete pixi.lock and regenerate it.
  • Missed benefit of perfectly reproducible dev environments across developers and CI

(b) commit the lock files

(I think this is the gist of it)

  • commit the lock file (now local devs and CI can use this lockfile) - around 40k lines
  • add GitHub PR automation to automatically update the lockfile every 3 weeks
  • Most of CI works from the committed lockfile, but there can be a bleeding-edge job which runs every few days by taking the current lockfile, running an update, and then running tests. Any failures can be automatically reported in an issue
    • Then, to "resolve" that issue you can add a pin in the pixi.toml manifest and talk with upstream to see what's up

Pros:

  • no need to generate lock files in CI
  • perfectly reproducible dev environments across developers and CI

Cons:

  • there is a bunch of added complexity/maintainer burden to setting this up (automated workflows etc)

@lucascolley knows the full extent as he's been exploring this setup at SciPy

Conclusion

Approach (a) has minimal setup/maintenance with little downside. I think that it's a good solution for smaller projects in particular (we've adopted it at Parcels - cc @maxrjones, this might be interesting based on your comment).

Approach (b) is more robust if having the same environment between all devs is highly valued (@shoyer mentioned during a dev meeting that this would be good for xarray), but requires more setup.

I recommend we go for (a) as is done in this PR, and consider (b) separately.

@lucascolley would it be beneficial to do a write-up of all this on prefix.dev sometime to help guide others dealing with this? I'm happy to write or collab on a blog post.

Feedback wanted: To what extent do we promote Conda dev workflows

Yeah - I don't know. In the projects I'm working on I've gone full Pixi, but those are smaller projects.

I've deleted the old environment files to avoid duplication, but can re-add them to whatever extent you want to support conda dev workflows.

I've held off on updating the contributing instructions for this reason.

EDIT: Joined the dev meeting - @keewis doesn't think it's a bad idea to fully migrate dev instructions from conda to Pixi. Later (if people really want conda instructions) we can show how to use Pixi to export a conda-compatible env file - no need for us to maintain two separate env files.


I think that's about it! I don't think I've forgotten anything, but it is late on a Friday so maybe - will update if that's the case :)

Let me know if you want me to drop by the dev meeting on 5 Nov - but I'm happy to keep this async otherwise.


(🎉 for my first significant contribution to Xarray!!!)

- Using the bare-minimum.yml requirements file to act as a starting point to build the composable environments
- Add pixi.lock to gitignore (no need to commit lock files in library repos)
- Update .gitattributes (automatically done by pixi)
- Configure xarray as source dependency with dynamic versioning
Already migrated to pixi
Update requirements files to remove deps handled by Pixi
@VeckoTheGecko VeckoTheGecko marked this pull request as ready for review November 10, 2025 18:56
@VeckoTheGecko (Contributor Author)

OK, the only remaining item for this PR I think is investigating the nightly builds - though that's orthogonal somewhat and thought I'd mark as ready in the meantime.

I've done a self-review of everything. I can imagine the contributing instructions will be the item that gets the most suggestions from y'all.

@VeckoTheGecko (Contributor Author) commented Nov 11, 2025

I've been implementing nightly support now (thanks @keewis for the pointer over in the Pixi Discord).

I've been running into difficulty installing Dask (as well as "distributed" and "dask-expr") from git sources. The following uv manifest [1] is a minimal reproducer:

# pyproject.toml
[project]
name = "test"
version = "0.1.0"
description = "Add your description here"
readme = "README.md"
requires-python = ">=3.12"
dependencies = [
    "dask",
    "distributed",
    "dask-expr",
]

[tool.uv.sources]
dask = { git = "https://github.com/dask/dask" }
distributed = { git = "https://github.com/dask/distributed" }
# dask-expr = { git = "https://github.com/dask/dask-expr" }

with the following error

(test) 🦎test   main[?] ❯ uv lock
    Updated https://github.com/dask/distributed (fde9e5f4b464c8011526bfebecbb476d0323d404)
    Updated https://github.com/dask/dask (6708b43888bf3c0f16328e4f5cb34f94f584d043)
  × No solution found when resolving dependencies:
  ╰─▶ Because there is no version of dask==2025.11.0 and distributed==2025.11.0 depends on dask==2025.11.0, we can conclude that distributed==2025.11.0 cannot be used.
      And because only distributed==2025.11.0 is available and your project depends on distributed, we can conclude that your project's requirements are unsatisfiable.

Would someone more familiar with Dask development/installing from Git be able to provide some insight?

Footnotes

  1. Note that Pixi uses uv for managing PyPI deps, so solving this for uv also solves it for Pixi.

Needed to add numcodecs since lock was complaining about unfound versions
Otherwise is the same as what was in `install-upstream-wheels.sh`
Nightly mypy and normal mypy were disagreeing - needed to add ignoring of unused-ignores

pixi run -e test-with-typing-py313 mypy
pixi run -e test-nightly python -m mypy --install-types --non-interactive
@github-actions github-actions bot added the topic-plotting, topic-arrays (related to flexible array support), and topic-rolling labels Nov 11, 2025
  with:
    file: pyright_report/cobertura.xml
-   flags: pyright39
+   flags: pyright
Contributor Author

When combining this into a matrix, I assumed that the pyright39 flag wasn't important and that the PYTHON_VERSION env var was fine.

@VeckoTheGecko (Contributor Author)

OK - all CI that needed to be migrated is working with the new Pixi setup (including nightlies). See https://github.com/VeckoTheGecko/xarray/tree/test-pixi-dust for the nightlies and pyright workflows (that branch just has an extra commit that enables the workflows for my fork)

I think this is ready for a full review cc @shoyer

@VeckoTheGecko (Contributor Author)

Re. #10888 (comment): it looks like dask is pinned to 2025.11.0 in the pyproject.toml of distributed (and, I assume, of dask-expr): https://github.com/dask/distributed/blob/fde9e5f4b464c8011526bfebecbb476d0323d404/pyproject.toml#L31
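
To make the kind of pin described above easier to spot, here is a small Python sketch that scans a list of requirement strings for exact `==` pins. The helper name and the example requirement list are hypothetical (the list is illustrative, not copied from the distributed pyproject.toml), and the regex is a simplification rather than a full requirement parser.

```python
import re

# Matches "name == version" with an optional environment marker after ";".
# Simplified on purpose: extras, URLs, and multi-clause specifiers are ignored.
_EXACT_PIN = re.compile(
    r"^\s*([A-Za-z0-9][A-Za-z0-9._-]*)\s*==\s*([^\s;,=]+)\s*(?:;.*)?$"
)


def exact_pins(requirements):
    """Return {package: version} for requirements pinned with '=='."""
    pins = {}
    for req in requirements:
        match = _EXACT_PIN.match(req)
        if match:
            pins[match.group(1).lower()] = match.group(2)
    return pins


# Illustrative dependency list (an assumption, not the real file contents).
example_deps = ["click >= 8.0", "dask == 2025.11.0", "tornado >= 6.2.0"]
print(exact_pins(example_deps))  # {'dask': '2025.11.0'}
```

An exact pin like this only accepts that one version, which is consistent with the "no version of dask==2025.11.0" message in the uv error shown earlier in the thread.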

@dcherian (Contributor)

Claude solved this by adding

[tool.uv]
override-dependencies = [
    "dask @ git+https://github.com/dask/dask@main",
]

This must be a uv thing? The tight pin on dask has been around for years.

@VeckoTheGecko (Contributor Author)

Thanks @dcherian! Fixed in 8abc993

@dcherian (Contributor)

Hmm.. does that apply globally so every env tests against dask main when dask is requested?

@VeckoTheGecko (Contributor Author)

> Hmm.. does that apply globally so every env tests against dask main when dask is requested?

No, only to the environments which have the nightly feature

pyarrow = "*"

distributed = { git = "https://github.com/dask/distributed" }
dask-expr = { git = "https://github.com/dask/dask-expr" }
Contributor

Suggested change
dask-expr = { git = "https://github.com/dask/dask-expr" }

Labels

  • Automation: Github bots, testing workflows, release automation
  • CI: Continuous Integration tools
  • dependencies: Pull requests that update a dependency file
  • topic-arrays: related to flexible array support
  • topic-documentation
  • topic-plotting
  • topic-rolling

Development

Successfully merging this pull request may close these issues.

Using Pixi for environment management

6 participants