Commit 6f570a7

change readme from rst to md
1 parent 11b6e14 commit 6f570a7

2 files changed: +117 -144 lines changed

README.md

Lines changed: 117 additions & 0 deletions
@@ -0,0 +1,117 @@
# Treeffuser

[![PyPI version](https://badge.fury.io/py/treeffuser.svg)](https://badge.fury.io/py/treeffuser)
[![License: MIT](https://img.shields.io/badge/License-MIT-green.svg)](https://opensource.org/licenses/MIT)
[![GitHub Stars](https://img.shields.io/github/stars/blei-lab/treeffuser?style=flat&logo=GitHub)](https://github.com/blei-lab/treeffuser/stargazers)
[![PyPI - Downloads](https://img.shields.io/pypi/dm/treeffuser)](https://pypi.org/project/treeffuser/)
[![Website](https://img.shields.io/badge/website-visit-blue?label=website)](https://blei-lab.github.io/treeffuser/)
[![Documentation](https://img.shields.io/badge/docs-passing-green)](https://blei-lab.github.io/treeffuser/docs/getting-started.html)
[![arXiv](https://img.shields.io/badge/arXiv-2406.07658-red)](https://arxiv.org/abs/2406.07658)

Treeffuser is an easy-to-use package for **probabilistic prediction on tabular data with tree-based diffusion models**.
It estimates distributions of the form `p(y|x)` where `x` is a feature vector and `y` is a target vector.
Treeffuser can model conditional distributions `p(y|x)` that are arbitrarily complex (e.g., multimodal, heteroscedastic, non-Gaussian, heavy-tailed).

It is designed to adhere closely to the scikit-learn API and require minimal user tuning.

<h3 align="center">
<b><a href="https://blei-lab.github.io/treeffuser/">Website</a></b> |
<b><a href="https://github.com/blei-lab/treeffuser/">GitHub</a></b> |
<b><a href="https://blei-lab.github.io/treeffuser/docs/getting-started.html">Documentation</a></b> |
<b><a href="https://arxiv.org/abs/2406.07658">Paper (NeurIPS 2024)</a></b>
</h3>

## Installation

Install Treeffuser from PyPI:

```bash
pip install treeffuser
```

Install the development version:

```bash
pip install git+https://github.com/blei-lab/treeffuser.git@main
```

The GitHub repository is located at: https://github.com/blei-lab/treeffuser

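To confirm which version ended up in the active environment, a small check with the standard library can help. This is only a sketch: `installed_version` is a hypothetical helper name introduced here, not part of Treeffuser.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package):
    """Return the installed version string of `package`, or None if it is not installed."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


# Prints the version string if the install succeeded, otherwise None.
print(installed_version("treeffuser"))
```
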
## Usage Example

Here's a simple example demonstrating how to use Treeffuser.

We generate a heteroscedastic response with two sinusoidal components and heavy tails.

```python
import matplotlib.pyplot as plt
import numpy as np
from treeffuser import Treeffuser, Samples

# Generate data
seed = 0
rng = np.random.default_rng(seed=seed)
n = 5000
x = rng.uniform(0, 2 * np.pi, size=n)
z = rng.integers(0, 2, size=n)
y = z * np.sin(x - np.pi / 2) + (1 - z) * np.cos(x) + rng.laplace(scale=x / 30, size=n)
```

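The structure of this synthetic response can be checked with NumPy alone. Since `sin(x - pi/2) = -cos(x)`, the binary variable `z` switches between the branches `-cos(x)` and `cos(x)`, which is what makes `p(y|x)` bimodal. The Laplace noise has scale `x / 30`, so its standard deviation grows linearly with `x`. The variable names below are illustrative only:

```python
import numpy as np

rng = np.random.default_rng(seed=0)
x = rng.uniform(0, 2 * np.pi, size=1000)

# The first component is a shifted sine, which equals -cos(x),
# so the two mixture branches are mirror images of each other.
component_a = np.sin(x - np.pi / 2)
component_b = np.cos(x)
assert np.allclose(component_a, -component_b)

# A Laplace distribution with scale b has standard deviation sqrt(2) * b,
# so the noise level increases linearly in x (heteroscedasticity).
noise_std = np.sqrt(2) * x / 30
```
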
We fit Treeffuser and generate samples. We then plot the samples against the raw data.

```python
# Fit the model
model = Treeffuser(seed=seed)
model.fit(x, y)

# Generate and plot samples
y_samples = model.sample(x, n_samples=1, seed=seed, verbose=True)
plt.scatter(x, y, s=1, label="observed data")
plt.scatter(x, y_samples[0, :], s=1, alpha=0.7, label="Treeffuser samples")
plt.legend()  # show the labels set above
plt.show()
```

![Treeffuser on heteroscedastic data](README_example.png)

Treeffuser accurately learns the target conditional densities and can generate samples from them.

These samples can be used to compute any downstream estimates of interest:

```python
y_samples = model.sample(x, n_samples=100, verbose=True)  # y_samples.shape[0] is 100

# Estimate downstream quantities of interest
y_mean = y_samples.mean(axis=0)  # conditional mean
y_std = y_samples.std(axis=0)  # conditional std
```

You can also use the `Samples` helper class:

```python
y_samples = Samples(y_samples)
y_mean = y_samples.sample_mean()
y_std = y_samples.sample_std()
y_quantiles = y_samples.sample_quantile(q=[0.05, 0.95])
```

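The 5% and 95% quantiles bound a 90% prediction interval for each input. The sketch below shows how such an interval and its empirical coverage could be computed with plain NumPy. It assumes samples stored with shape `(n_samples, n)`, uses synthetic Gaussian stand-in data rather than Treeffuser output, and `empirical_interval` is a hypothetical helper name, not part of the package.

```python
import numpy as np


def empirical_interval(y_samples, level=0.9):
    """Per-point (lower, upper) quantile bounds from samples of shape (n_samples, n)."""
    alpha = (1.0 - level) / 2.0
    lower, upper = np.quantile(y_samples, [alpha, 1.0 - alpha], axis=0)
    return lower, upper


# Synthetic stand-in for model samples and held-out observations.
rng = np.random.default_rng(0)
n_samples, n = 200, 1000
y_samples = rng.normal(size=(n_samples, n))
y_obs = rng.normal(size=n)

lower, upper = empirical_interval(y_samples, level=0.9)

# Fraction of observations that fall inside their interval;
# for a well-calibrated model this should be close to `level`.
coverage = np.mean((y_obs >= lower) & (y_obs <= upper))
```

Comparing `coverage` against the nominal level on held-out data is a simple calibration check for any sample-based predictive model.
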
See the [documentation](https://blei-lab.github.io/treeffuser/docs/getting-started.html) for more information on available methods and parameters.

---

## Citing Treeffuser

If you use Treeffuser in your work, please cite:

```bibtex
@article{beltranvelez2024treeffuser,
  title={Treeffuser: Probabilistic Predictions via Conditional Diffusions with Gradient-Boosted Trees},
  author={Nicolas Beltran-Velez and Alessandro Antonio Grande and Achille Nazaret and Alp Kucukelbir and David Blei},
  year={2024},
  eprint={2406.07658},
  archivePrefix={arXiv},
  primaryClass={cs.LG},
  url={https://arxiv.org/abs/2406.07658},
}
```

README.rst

Lines changed: 0 additions & 144 deletions
This file was deleted.
