31 commits
a4bfa45
init taylor_seer cache
Nov 13, 2025
fe20f97
Merge branch 'main' into feat-taylorseer
toilaluan Nov 13, 2025
8f495b6
make compatible with any tuple size returned
Nov 13, 2025
8f80072
use logger for printing, add warmup feature
Nov 13, 2025
0602044
still update in warmup steps
Nov 13, 2025
1099e49
refactor, add docs
Nov 14, 2025
7b4ad2d
add configurable cache, skip compute module
Nov 14, 2025
51b4318
allow special cache ids only
Nov 15, 2025
7238d40
add stop_predicts (cooldown)
Nov 16, 2025
acfebfa
update docs
Nov 17, 2025
9290b58
Merge branch 'main' into feat-taylorseer
toilaluan Nov 17, 2025
d929ab2
apply ruff
Nov 17, 2025
05f61a9
Merge branch 'main' into feat-taylorseer
toilaluan Nov 20, 2025
9083e1e
update to handle multiple calls per timestep
Nov 20, 2025
a8ea383
refactor to use state manager
Nov 25, 2025
b321713
Merge branch 'main' into feat-taylorseer
toilaluan Nov 25, 2025
2be31f8
fix format & doc
Nov 25, 2025
a644417
Merge branch 'huggingface:main' into feat-taylorseer
toilaluan Nov 26, 2025
656c7bc
Merge branch 'main' into feat-taylorseer
toilaluan Nov 26, 2025
24267c7
chores: naming, remove redundancy
Nov 28, 2025
83b6253
add docs
Nov 28, 2025
309ce72
quality & style
Nov 28, 2025
d06c6bc
fix taylor precision
Nov 28, 2025
ddc6164
Merge branch 'main' into feat-taylorseer
toilaluan Nov 28, 2025
716dfe1
Apply style fixes
github-actions[bot] Nov 28, 2025
e2dae7e
add tests
Nov 29, 2025
289146e
Apply style fixes
github-actions[bot] Nov 30, 2025
475ec02
Remove TaylorSeerCacheTesterMixin from flux2 tests
toilaluan Dec 1, 2025
4fb3f53
rename identifiers, use more expressive taylor predict loop
Dec 3, 2025
76494ca
torch compile compatible
Dec 4, 2025
ca24569
Apply style fixes
github-actions[bot] Dec 4, 2025
6 changes: 6 additions & 0 deletions docs/source/en/api/cache.md
@@ -34,3 +34,9 @@ Cache methods speedup diffusion transformers by storing and reusing intermediate
[[autodoc]] FirstBlockCacheConfig

[[autodoc]] apply_first_block_cache

### TaylorSeerCacheConfig

[[autodoc]] TaylorSeerCacheConfig

[[autodoc]] apply_taylorseer_cache
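
For module-level control, the functional API can be applied directly instead of `enable_cache`. A minimal sketch follows, assuming `apply_taylorseer_cache` mirrors the `(module, config)` signature of the neighboring `apply_first_block_cache` helper; the diff here documents the symbols but does not show the signature.

```python
import torch
from diffusers import FluxPipeline, TaylorSeerCacheConfig, apply_taylorseer_cache

pipe = FluxPipeline.from_pretrained(
    "black-forest-labs/FLUX.1-dev", torch_dtype=torch.bfloat16
)

# Assumed signature, mirroring apply_first_block_cache(module, config).
config = TaylorSeerCacheConfig(cache_interval=5, max_order=1, disable_cache_before_step=10)
apply_taylorseer_cache(pipe.transformer, config)
```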
27 changes: 27 additions & 0 deletions docs/source/en/optimization/cache.md
@@ -66,4 +66,31 @@ config = FasterCacheConfig(
tensor_format="BFCHW",
)
pipeline.transformer.enable_cache(config)
```

## TaylorSeer Cache

[TaylorSeer Cache](https://huggingface.co/papers/2503.06923) accelerates diffusion inference by using Taylor series expansions to approximate and cache intermediate activations across denoising steps. Rather than recomputing every module at every step, it predicts future outputs from past computations and reuses those predictions over a configurable interval, reducing redundant work.

It also supports selective module skipping ("inactive" mode), where designated modules return zero tensors during prediction steps so their computation is skipped cheaply, and a lightweight "lite" mode that reduces memory usage through predefined skip-and-cache patterns.

Set up and pass a [`TaylorSeerCacheConfig`] to a pipeline's transformer to enable it. `cache_interval` controls how many steps cached predictions are reused before a full forward pass refreshes them. `disable_cache_before_step` sets the number of initial warmup steps that always run full computations so the cache can gather data for its approximations. A higher `max_order` keeps more Taylor terms, improving approximation accuracy at the cost of extra memory.

```python
import torch
from diffusers import FluxPipeline, TaylorSeerCacheConfig

pipe = FluxPipeline.from_pretrained(
"black-forest-labs/FLUX.1-dev",
torch_dtype=torch.bfloat16,
)
pipe.to("cuda")

config = TaylorSeerCacheConfig(
cache_interval=5,
max_order=1,
disable_cache_before_step=10,
taylor_factors_dtype=torch.bfloat16,
)
pipe.transformer.enable_cache(config)
```
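
To make the approximation concrete, below is a minimal, self-contained sketch of the first-order Taylor forecast this kind of cache performs; the function name, fixed step size, and finite-difference form are illustrative assumptions, not the hook's actual internals.

```python
import torch

def taylor_forecast(y_prev: torch.Tensor, y_curr: torch.Tensor, steps_ahead: int) -> torch.Tensor:
    # First-order Taylor expansion around the current step:
    # y(t + k) ≈ y(t) + k * y'(t), with the finite difference
    # y(t) - y(t - 1) standing in for the derivative y'(t).
    first_derivative = y_curr - y_prev
    return y_curr + steps_ahead * first_derivative

# Two real forward passes produce cached activations...
y0 = torch.randn(2, 16)
y1 = y0 + 0.1 * torch.randn(2, 16)

# ...then intermediate steps reuse the forecast instead of recomputing.
y2_approx = taylor_forecast(y0, y1, steps_ahead=1)
```

Higher `max_order` values keep additional finite-difference terms, which is why they trade memory for accuracy.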
4 changes: 4 additions & 0 deletions src/diffusers/__init__.py
@@ -169,10 +169,12 @@
"LayerSkipConfig",
"PyramidAttentionBroadcastConfig",
"SmoothedEnergyGuidanceConfig",
"TaylorSeerCacheConfig",
"apply_faster_cache",
"apply_first_block_cache",
"apply_layer_skip",
"apply_pyramid_attention_broadcast",
"apply_taylorseer_cache",
]
)
_import_structure["models"].extend(
@@ -890,10 +892,12 @@
LayerSkipConfig,
PyramidAttentionBroadcastConfig,
SmoothedEnergyGuidanceConfig,
TaylorSeerCacheConfig,
apply_faster_cache,
apply_first_block_cache,
apply_layer_skip,
apply_pyramid_attention_broadcast,
apply_taylorseer_cache,
)
from .models import (
AllegroTransformer3DModel,
1 change: 1 addition & 0 deletions src/diffusers/hooks/__init__.py
@@ -25,3 +25,4 @@
from .layerwise_casting import apply_layerwise_casting, apply_layerwise_casting_hook
from .pyramid_attention_broadcast import PyramidAttentionBroadcastConfig, apply_pyramid_attention_broadcast
from .smoothed_energy_guidance_utils import SmoothedEnergyGuidanceConfig
from .taylorseer_cache import TaylorSeerCacheConfig, apply_taylorseer_cache