Releases: OpenMOSS/Language-Model-SAEs

v2.0.0b3

22 Nov 06:56

What's Changed

Full Changelog: v2.0.0b2...v2.0.0b3

v2.0.0b2

04 Nov 07:41

What's Changed

  • Major Refactor by @dest1n1s in #66
  • fix(misc): fix calculate_activation_norm method by @Frankstein73 in #67
  • fix(runner): fixed the issue where the wandb logger was not properly … by @Frankstein73 in #68
  • refactor(sae): decouple encoder & decoder from sae methods by @Frankstein73 in #69
  • fix misc sae.py issues (batch topk, activation func binary mask imple… by @Hzfinfdu in #70
  • feat(sae): support saving/loading dataset_average_activation_norm to/from SAE state dict by @dest1n1s in #71
  • re-implement crosscoders by @Hzfinfdu in #72
  • fix(topk activation): add keepdim=True to enable broadcasting; make d… by @Hzfinfdu in #73
  • feat(mixcoder): implemented mixcoder by @Frankstein73 in #75
  • test(sae): fix test SAE fixture; specify test weight for exact-checking forward computation by @dest1n1s in #74
  • Fix minor bugs in Crosscoders. May also affect mixcoder training by @Hzfinfdu in #76
  • fix(config): fix lr warmup & cooldown step default value by @Frankstein73 in #77
  • support mixcoder training by @Frankstein73 in #78
  • Fix bugs in mixcoder by @Frankstein73 in #79
  • add extra log info for mixcoder by @Frankstein73 in #80
  • feat(activation): enable aligned permutation of crossmodel gen by @MerlinRaptor in #82
  • refactor(activation): improve cached activation reading performance using PyTorch Dataloader by @dest1n1s in #84
  • feat(runner): support training with non-pre-generated activation by @dest1n1s in #85
  • feat(mixcoder): changed loss calculation method and added more log in… by @Frankstein73 in #86
  • Anthropic Jan Jumprelu; triton spmm decoding; miscs by @Hzfinfdu in #89
  • fix(activation): preserve tokens type during dtype conversion by @Frankstein73 in #90
  • implement dynamic dispatch for sae instantiation in analyze_sae(runner) by @MerlinRaptor in #91
  • fix(analyze): misc things stopping analysis working by @dest1n1s in #92
  • fix backend missing-shard-idx bug which causes incorrect context inde… by @Hzfinfdu in #93
  • feat(kernels): support spmm triton kernel for topk saes. by @Hzfinfdu in #94
  • fix(cached_acts): re-implement changes mistakenly removed in 71ff9f9 by @Hzfinfdu in #95
  • fix(kernel): fix speed degradation of TopK kernel by @Hzfinfdu in #96
  • Major Update: Add Crosscoder; Refactor Training & Parallelism by @dest1n1s in #103
  • Major Update: Add Crosscoder; Refactor Training & Parallelism Dynamics by @dest1n1s in #104
  • Add support for LLaDA by @Frankstein73 in #105
  • feat(trainer): remove l0_based decoder weight learning rate by @Frankstein73 in #107
  • Dynamics Updates by @dest1n1s in #106
  • feat(config): add prepend_bos option to LanguageModelConfig by @Frankstein73 in #108
  • chore: update submodule and refactor configuration by @Frankstein73 in #109
  • chore: update TransformerLens submodule and remove MixCoder references by @Frankstein73 in #110
  • Introduce CLT implementation, supports distributed settings by @Hzfinfdu in #111
  • feat(trainer): support param groups by @dest1n1s in #112
  • implements distributed cached activation loading; fix layernorm hook_normalized precision for rms_norm by @Hzfinfdu in #113
  • feat(activation): support checking activation consistency by @dest1n1s in #114
  • feat(language_model): enhance activation processing and configuration by @Frankstein73 in #115
  • fix(ci): ignore all import errors; temporarily remove unit tests checking; fix ci dependency installation by @dest1n1s in #116
  • Add metric filtering by @dest1n1s in #117
  • Support LLaDA Feature Analysis and Distributed Feature Analyzer by @Frankstein73 in #118
  • Fix data parallelism by @dest1n1s in #119
  • Fix batch size validation for data parallelism and adjust total count for activation processing in CachedActivationLoader by @Frankstein73 in #120
  • feat(analysis): reimplement DirectLogitAttributor and related configurations by @dest1n1s in #121
  • dynamics by @dest1n1s in #122
  • Major update: Add CLT, LoRSA and MoLT; Refactor AbstractSAE interfaces for compatibility by @dest1n1s in #126
  • molt implementation by @MerlinRaptor in #124
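
Several of the PRs above (#70, #73, #94, #96) touch the TopK activation used by TopK SAEs. As a rough illustration of the idea only, not the repository's actual implementation, here is a dependency-free Python sketch; the real code operates on tensors, where the per-row threshold is computed with keepdim=True so it broadcasts over the feature dimension (the fix in #73):

```python
def topk_activation(pre_acts, k):
    """Keep only the k largest pre-activations in each row, zeroing the rest.

    Plain-Python sketch of a TopK SAE activation. In a tensor implementation
    the per-row cutoff would be computed with keepdim=True so it broadcasts
    against the (batch, d_sae) pre-activation matrix.
    """
    out = []
    for row in pre_acts:
        # The k-th largest value in this row acts as the cutoff.
        thresh = sorted(row, reverse=True)[k - 1]
        out.append([x if x >= thresh else 0.0 for x in row])
    return out

# Example: keep the 2 largest of 4 pre-activations per row.
print(topk_activation([[0.3, 1.2, -0.5, 0.9]], k=2))  # → [[0.0, 1.2, 0.0, 0.9]]
```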

New Contributors

  • @MerlinRaptor made their first contribution in #82

Full Changelog: v1.1.0...v2.0.0b2

v1.1.0

31 Dec 08:46

This version is stable for replicating Llama Scope SAEs and most canonical SAEs. Crosscoders are not yet fully supported.

v1.0.0

01 Nov 12:24

Stable Llama Scope Implementation

v0.1.0

25 Oct 15:09

v0.1.0 Pre-release
feat(ui/circuit): show attention pattern