Commit 9f5bb83

feat: Add pixi project configuration (#227)

* Add pixi manifest (pixi.toml) and pixi lockfile (pixi.lock) to fully specify the project dependencies. This provides a multi-environment multi-platform (Linux, macOS) lockfile.
* In addition to the default feature, add 'latest', 'cms-open-data-ttbar', and 'local' features and corresponding environments composed from the features. The 'cms-open-data-ttbar' feature is designed to be compatible with the Coffea Base image which uses SemVer coffea (Coffea-casa build with coffea 0.7.21/dask 2022.05.0/HTCondor and cheese).
   - The cms-open-data-ttbar feature has an 'install-ipykernel' task that installs a kernel such that the pixi environment can be used on a coffea-casa instance from a notebook.
   - The local features have the canonical 'start' task that will launch a jupyter lab session inside of the environment.
* Add use instructions for the pixi environments to the cms-open-data-ttbar README.

1 parent 62f51e7 commit 9f5bb83

5 files changed: +23190 -1 lines changed

.gitattributes

Lines changed: 2 additions & 0 deletions
@@ -1 +1,3 @@
 *.model filter=lfs diff=lfs merge=lfs -text
+# GitHub syntax highlighting
+pixi.lock linguist-language=YAML linguist-generated=true

.gitignore

Lines changed: 4 additions & 0 deletions
@@ -33,3 +33,7 @@ analyses/cms-open-data-ttbar/metrics
 
 # dask
 dask-worker-space/
+
+# pixi environments
+.pixi
+*.egg-info

analyses/cms-open-data-ttbar/README.md

Lines changed: 47 additions & 1 deletion
@@ -20,11 +20,57 @@ This directory is focused on running the CMS Open Data $t\bar{t}$ analysis throu
 | utils/config.py | This is a general config file to handle different options for running the analysis. |
 | utils/hepdata.py | Function to create tables for submission to the [HEP_DATA website](https://www.hepdata.net) (use `HEP_DATA = True`) |
 
+#### Setting up the environment
+
+##### On Coffea-casa
+
+1. Install [`pixi`](https://pixi.sh/latest/#installation).
+2. From the top level of the entire repository run
+
+```
+pixi run --environment cms-open-data-ttbar install-ipykernel
+```
+
+This will install all of the software and create an `ipykernel` that the Coffea-casa Jupyter Lab instance will be able to see.
+
+3. In the Coffea-casa Jupyter Lab browser, navigate and open up the `analyses/cms-open-data-ttbar/ttbar_analysis_pipeline.ipynb`.
+4. Change the kernel of the notebook to be `cms-open-data-ttbar`.
+
+##### On a local machine
+
+To get a local Python environment that has all the software required for the analysis:
+
+1. Install [`pixi`](https://pixi.sh/latest/#installation) on your machine.
+2. Update `analyses/cms-open-data-ttbar/utils/config.py` to use `"local"` for the `"AF"` key.
+
+```
+sed -i 's/"AF": "coffea_casa"/"AF": "local"/g' analyses/cms-open-data-ttbar/utils/config.py # Linux
+```
+```
+sed -i '' 's/"AF": "coffea_casa"/"AF": "local"/g' analyses/cms-open-data-ttbar/utils/config.py # macOS
+```
+3. From the top level of the entire repository run
+
+```
+pixi run --environment local-cms-open-data-ttbar start
+```
+
+This will install all of the software and launch a Jupyter lab session.
+You can then use the file navigator and terminal in Jupyter lab to navigate to this directory to run the analysis.
+
+**Note**: Given the size of the files, when running locally you will probably want to set the `USE_SERVICEX` global configuration variable in the `analyses/cms-open-data-ttbar/ttbar_analysis_pipeline.ipynb` notebook to `True`
+
+```python
+USE_SERVICEX = True
+```
+
+This requires you to have a ServiceX configuration file on your machine.
+
 #### Instructions for paired notebook
 
 If you only care about running the `ttbar_analysis_pipeline.ipynb` notebook, you can completely ignore the `ttbar_analysis_pipeline.py` file.
 
-This notebook (`ttbar_analysis_pipeline.ipynb`) is paired to the file `ttbar_analysis_pipeline.py` via Jupytext (https://jupytext.readthedocs.io/en/latest/). Using `git diff` with this file instead of the `.ipynb` file is much simpler, as you don't have to deal with notebook metadata or output images. However, in order for the notebook output to be preserved, the notebook still needs to be version controlled. It is ideal to run `git diff` with the option `-- . ':(exclude)*.ipynb'`, so that `.ipynb` files are ignored.
+This notebook (`ttbar_analysis_pipeline.ipynb`) is paired to the file `ttbar_analysis_pipeline.py` via Jupytext (https://jupytext.readthedocs.io/en/latest/). Using `git diff` with this file instead of the `.ipynb` file is much simpler, as you don't have to deal with notebook metadata or output images. However, in order for the notebook output to be preserved, the notebook still needs to be version controlled. It is ideal to run `git diff` with the option `-- . ':(exclude)*.ipynb'`, so that `.ipynb` files are ignored.
 
 The `.py` file can also be run as a Python script.

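
The local setup steps in the README diff give two `sed` commands because in-place editing with `sed -i` takes different arguments on Linux and macOS. As a hedged alternative, the sketch below performs the same literal replacement in Python, which behaves the same on both platforms. The function names are hypothetical, and the sketch assumes the `"AF"` entry appears in `config.py` exactly as the `sed` pattern expects (`"AF": "coffea_casa"`).

```python
from pathlib import Path

# Path taken from the README; it is relative to the top level of the repository.
CONFIG_PATH = Path("analyses/cms-open-data-ttbar/utils/config.py")


def switch_analysis_facility(text, old="coffea_casa", new="local"):
    """Return `text` with every `"AF": "<old>"` replaced by `"AF": "<new>"`.

    This mirrors the sed expression: a literal, global replacement.
    """
    return text.replace(f'"AF": "{old}"', f'"AF": "{new}"')


def update_config(path=CONFIG_PATH):
    """Rewrite the config file in place and report whether it changed."""
    original = path.read_text()
    updated = switch_analysis_facility(original)
    if updated != original:
        path.write_text(updated)
    return updated != original
```

Running `update_config()` from the top level of the repository would apply the change. It returns `False` when the file already uses `"local"` or when the pattern is not found, so a second run is harmless.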